With the rapid advancement in healthcare, there has been exponential growth in the healthcare records stored in large databases to help researchers, clinicians, and medical practitioners provide optimal patient care and support research and trials. Since these studies and records are lengthy and time-consuming for clinicians and medical practitioners to work through, there is a demand for new, fast, and intelligent medical information retrieval methods. The present study is part of a project that aims to design an intelligent medical information retrieval and summarization system. The whole system comprises three main modules, namely adverse drug event classification (ADEC), medical named entity recognition (MNER), and multi-model text summarization (MMTS). In the current study, we present the design of the ADEC module for classification tasks, where basic machine learning (ML) and deep learning (DL) techniques, such as logistic regression (LR), decision tree (DT), and text-based convolutional neural networks...
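As a rough illustration of the simplest ADEC baseline named above, the following sketch trains a logistic regression classifier on TF-IDF features with scikit-learn; the two example sentences and labels are placeholders, not data from the study.

```python
# Hypothetical baseline for adverse drug event classification:
# TF-IDF features fed to logistic regression (scikit-learn).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = ["patient developed severe rash after drug X",
         "no adverse reaction observed during the trial"]
labels = [1, 0]  # 1 = adverse drug event, 0 = no event (placeholder data)

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
clf.fit(texts, labels)
print(clf.predict(["mild headache reported after second dose"]))
```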
For multi-target tracking, target representation plays a crucial role in performance. State-of-the-art approaches rely on deep learning-based visual representations that give optimal performance at the cost of high computational complexity. In this paper, we come up with a simple yet effective target representation for human tracking. Our inspiration comes from the fact that the human body goes through severe deformation and inter/intra occlusion over the passage of time. So, instead of tracking the whole body, a relatively rigid organ is selected for tracking the human over an extended period of time. Hence, we followed the tracking-by-detection paradigm and generated the target hypothesis of only the spatial locations of heads in every frame. After the localization of the head location, a Kalman filter with a constant velocity motion model is instantiated for each target that follows the temporal evolution of the targets in the scene. For associating the targets in ...
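A minimal sketch of the per-target motion model described above: a constant-velocity Kalman filter over a head centre, written with NumPy. The noise covariances and initial state are assumptions for illustration.

```python
import numpy as np

# Constant-velocity Kalman filter for one head centre (x, y).
dt = 1.0  # one frame
F = np.array([[1, 0, dt, 0],   # state: [x, y, vx, vy]
              [0, 1, 0, dt],
              [0, 0, 1,  0],
              [0, 0, 0,  1]])
H = np.array([[1, 0, 0, 0],    # only the head position is observed
              [0, 1, 0, 0]])
Q = np.eye(4) * 1e-2           # process noise (assumed)
R = np.eye(2) * 1.0            # measurement noise (assumed)

x = np.array([100., 50., 0., 0.])  # initial state from the first detection
P = np.eye(4) * 10.

def predict(x, P):
    return F @ x, F @ P @ F.T + Q

def update(x, P, z):
    y = z - H @ x                        # innovation
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)       # Kalman gain
    return x + K @ y, (np.eye(4) - K @ H) @ P

x, P = predict(x, P)
x, P = update(x, P, np.array([102., 51.]))  # head detection in the next frame
```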
In the tracking-by-detection paradigm for multi-target tracking, target association is modeled as an optimization problem that is usually solved through a network flow formulation. In this paper, we propose a combinatorial optimization formulation and use bipartite graph matching for associating the targets in consecutive frames. Usually, the target of interest is represented by a bounding box, and the whole box is tracked as a single entity. However, in the case of humans, the body goes through complex articulation and occlusion that severely deteriorate tracking performance. To partially tackle the problem of occlusion, we argue that tracking a rigid body organ could lead to better tracking performance than whole-body tracking. Based on this assumption, we generated the target hypothesis of only the spatial locations of persons’ heads in every frame. After the localization of the head location, a constant velocity motion model is used for the temporal evolution of the targets...
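The association step can be sketched as a bipartite matching solved with the Hungarian algorithm via SciPy; the Euclidean-distance cost and the gating threshold below are illustrative assumptions, not necessarily the paper's exact affinity measure.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

tracks = np.array([[100., 50.], [220., 80.]])            # predicted head positions
detections = np.array([[102., 52.], [218., 79.], [400., 10.]])

# Pairwise Euclidean distance as the association cost (assumed).
cost = np.linalg.norm(tracks[:, None, :] - detections[None, :, :], axis=2)
row, col = linear_sum_assignment(cost)   # Hungarian algorithm
for t, d in zip(row, col):
    if cost[t, d] < 30.0:                # gating threshold (assumed)
        print(f"track {t} -> detection {d}")
```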
Today’s eLearning websites are heavily loaded with multimedia content, which is often unstructured, unedited, unsynchronized, and lacking inter-links among its components. Hyperlinking different media modalities may provide a solution for quick navigation and easy retrieval of pedagogical content in media-driven eLearning websites. In addition, finding meta-data information to describe and annotate media content in eLearning platforms is a challenging, laborious, error-prone, and time-consuming task. Thus, annotations for multimedia, especially lecture videos, have become an important part of video learning objects. To address this issue, this paper proposes three major contributions, namely automated video annotation, 3-Dimensional (3D) tag clouds, and the hyper interactive presenter (HIP) eLearning platform. Combining the existing state-of-the-art SIFT together with tag clouds, a novel approach for automatic lecture video annotation for HIP is proposed. New video annotation...
This paper proposes a reference-free perceptual quality metric for blackboard lecture images. The text in the image is mostly affected by high compression ratios and de-noising filters, which cause blocking and blurring artifacts. As a result, the perceived text quality of the blackboard image degrades. The degraded text is not only difficult for humans to read, but it also makes the optical character recognition task even more difficult. Therefore, we first estimate the presence of these artifacts and then use them in our proposed quality metric. The blocking and blurring features are extracted from the image content on block boundaries without the presence of a reference image, which makes our metric reference-free. The metric also uses a visual saliency model to mimic the human visual system (HVS) by focusing only on distortions in perceptually important regions, i.e., those regions that contain text. Moreover, psychophysical experiments are conducted to...
A new interest in the use of game factors while acquiring new knowledge has emerged, and a number of researchers are investigating the effectiveness of the game-based approach in education systems. Recent research in game-based learning suggests that this approach imparts learning by involving learners in the learning process. The game factors generate affective-cognitive reactions that absorb users in playing the game and positively influence learning. This paper offers a comparison of the learning processes between the game-based learning and pen-and-paper approaches. In this paper, the analysis of both learning approaches is realized through brain-controlled technology, using the Emotiv EEG Tech headset, by analyzing the stress, excitement, relaxation, focus, interest, and engagement that the learner experiences while going through both approaches.
It has been more than a year since the coronavirus (COVID-19) engulfed the whole world, disturbing daily routines, bringing down economies, and killing two million people across the globe at the time of writing. The pandemic brought the world together in a joint effort to find a cure and work toward developing a vaccine. Much to everyone’s anticipation, the first batch of vaccines started rolling out by the end of 2020, and many countries began their vaccination drives early on, while others were still waiting for a successful trial. Social media, meanwhile, was bombarded with all sorts of positive and negative stories about the development and the evolving coronavirus situation. Many people were looking forward to the vaccines, while others were cautious about the side effects and the conspiracy theories, resulting in mixed emotions. This study explores users’ tweets concerning the COVID-19 vaccine and the sentiments expressed on Twitter. It tries to evaluate the polarity...
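The excerpt does not name the sentiment tool used, so the sketch below stands in with TextBlob to show what per-tweet polarity scoring of vaccine tweets might look like; the example tweets are invented.

```python
from textblob import TextBlob

tweets = ["Finally got my COVID-19 vaccine, feeling hopeful!",
          "Worried about the side effects of this rushed vaccine."]
for t in tweets:
    polarity = TextBlob(t).sentiment.polarity  # -1 (negative) .. +1 (positive)
    label = "positive" if polarity > 0 else "negative" if polarity < 0 else "neutral"
    print(f"{polarity:+.2f} {label}: {t}")
```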
In the last decade, sentiment analysis has been widely applied in many domains, including business, social networks, and education. Particularly in the education domain, where dealing with and processing students’ opinions is a complicated task due to the nature of the language used by students and the large volume of information, the application of sentiment analysis is growing yet remains challenging. Several literature reviews reveal the state of the application of sentiment analysis in this domain from different perspectives and contexts. However, the body of literature lacks a review that systematically classifies the research and results of applying natural language processing (NLP), deep learning (DL), and machine learning (ML) solutions for sentiment analysis in the education domain. In this article, we present the results of a systematic mapping study to structure the published information available. We used a stepwise PRISMA framework to guide the search process...
In computer vision, traditional machine learning (TML) and deep learning (DL) methods have significantly contributed to the advancement of medical image analysis (MIA) by enhancing prediction accuracy, leading to appropriate planning and diagnosis. These methods have substantially improved the automatic detection of brain tumors and leukemia/blood cancer and can assist hematologists and doctors by providing a second opinion. This review provides an in-depth analysis of available TML and DL techniques for MIA, with a significant focus on leukocyte classification in blood smear images and other medical imaging domains, i.e., magnetic resonance imaging (MRI), CT, X-ray, and ultrasound images. The review’s main aim is to find the most suitable TML and DL techniques in MIA, especially for leukocyte classification in blood smear images. The advanced DL techniques, particularly the evolving convolutional neural network-based models in the MIA domain, are deeply investigated...
Data imbalance is a frequently occurring problem in classification tasks where the number of samples in one category exceeds that in others. Quite often, the minority class data is of great importance, representing concepts of interest, and is often challenging to obtain in real-life scenarios and applications. Consider a customer dataset for bank loans: the majority of instances belong to the non-defaulter class and only a small number of customers are labeled as defaulters; however, in such highly imbalanced datasets, accuracy on the defaulter label matters more than on the non-defaulter label. The lack of enough data samples across all class labels results in data imbalance, causing poor classification performance while training the model. Synthetic data generation and oversampling techniques such as SMOTE and ADASYN can address this issue for statistical data, yet such methods suffer from overfitting and substantial noise. While such techniques have proved useful for synthetic...
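A small sketch of the SMOTE oversampling mentioned above, using imbalanced-learn on a synthetic dataset that mimics the bank-loan imbalance; the 95/5 class split is illustrative.

```python
from collections import Counter
from sklearn.datasets import make_classification
from imblearn.over_sampling import SMOTE

# Synthetic imbalanced data standing in for the defaulter example.
X, y = make_classification(n_samples=1000, n_features=10,
                           weights=[0.95, 0.05], random_state=42)
print("before:", Counter(y))               # minority class ~5%
X_res, y_res = SMOTE(random_state=42).fit_resample(X, y)
print("after: ", Counter(y_res))           # both classes balanced
```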
How different cultures react and respond to a crisis is largely determined by a society's norms and political will to combat the situation. Often, the decisions made are necessitated by events, social pressure, or the need of the hour, which may not represent the nation's will. While some are pleased with them, others might show resentment. Coronavirus (COVID-19) brought a mix of similar emotions from nations towards the decisions taken by their respective governments. Social media was bombarded with posts containing both positive and negative sentiments on COVID-19, the pandemic, lockdowns, and related hashtags over the past couple of months. Despite being geographically close, many neighboring countries reacted differently from one another. For instance, Denmark and Sweden, which share many similarities, stood poles apart on the decisions taken by their respective governments. Yet their nations' support was mostly unanimous, unlike the South Asian neighboring countries where people showed a lot of...
Students' feedback is an effective mechanism that provides valuable insights about the teaching-learning process. Handling opinions of students expressed in reviews is a labour-intensive and tedious task, as it is typically performed manually. While this may be viable for small-scale courses that involve just a few students' reviews, it is impractical for large-scale cases, as applies to online courses in general, and MOOCs in particular. To address this issue, we propose in this paper a framework for automatically analyzing opinions of students expressed in reviews. Specifically, the framework relies on aspect-level sentiment analysis and aims to automatically identify the sentiment or opinion polarity expressed towards a given aspect related to the MOOC. The proposed framework takes advantage of weakly supervised annotation of MOOC-related aspects and propagates the weak supervision signal to effectively identify the aspect categories...
MOOCs represent a powerful way to deliver educational content in higher education settings by providing high-quality educational material to students throughout the world. Considering the differences between the traditional learning paradigm and MOOCs, a new research agenda focusing on predicting and explaining student dropout and low completion rates in MOOCs has emerged. However, due to different problem specifications and evaluation metrics, performing a comparative analysis of state-of-the-art machine learning architectures is a challenging task. In this paper, we provide an overview of the MOOC student dropout prediction phenomenon where machine learning techniques have been utilized. Furthermore, we highlight some solutions being used to tackle the dropout problem, analyze the challenges of prediction models, and propose some valuable insights and recommendations that might lead to developing useful and effective machine learning solutions to solve the ...
This paper presents a semantically rich document representation model for automatically classifying financial documents into predefined categories utilizing deep learning. The model architecture consists of two main modules: document representation and document classification. In the first module, a document is enriched with semantics using background knowledge provided by an ontology and through the acquisition of its relevant terminology. Acquisition of terminology integrated with the ontology extends the capabilities of semantically rich document representations with in-depth coverage of concepts, thereby capturing the whole conceptualization involved in documents. The semantically rich representations obtained from the first module serve as input to the document classification module, which aims at finding the most appropriate category for the document through deep learning. Three different deep learning networks, each belonging to a different category of machine learning techniques, are used for ontological document classification with a real-life ontology. Multiple simulations are carried out with various deep neural network configurations, and our findings reveal that a three-hidden-layer feedforward network with 1024 neurons per layer obtains the highest document classification performance on the INFUSE dataset. The performance in terms of F1 score is further increased by almost five percentage points to 78.10% for the same network configuration when the relevant terminology integrated with the ontology is applied to enrich the document representation. Furthermore, we conducted a comparative performance evaluation using various state-of-the-art document representation approaches and classification techniques, including shallow and conventional machine learning classifiers.
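The best-performing configuration reported above (three hidden layers of 1024 neurons) can be sketched with scikit-learn's MLPClassifier; the random stand-in features below take the place of the semantically enriched document representations.

```python
from sklearn.neural_network import MLPClassifier
from sklearn.datasets import make_classification

# Random stand-in features; in the paper the input is the
# ontology-enriched document representation.
X, y = make_classification(n_samples=500, n_features=300, random_state=0)

# Feedforward network with three hidden layers of 1024 neurons.
net = MLPClassifier(hidden_layer_sizes=(1024, 1024, 1024),
                    max_iter=50, random_state=0)
net.fit(X, y)
print(f"training accuracy: {net.score(X, y):.2f}")
```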
The wide use of ontologies in different applications has resulted in a plethora of automatic approaches for ontology population and enrichment. Ontology enrichment is an iterative process where the existing ontology is continuously updated with new concepts. A key aspect of the ontology enrichment process is the concept learning approach. A learning approach can be linguistic-based, statistical-based, or hybrid, employing both linguistic and statistical learning techniques. This chapter presents a concept enrichment model that combines contextual and semantic information of terms. The proposed model, called SEMCON, employs a hybrid concept learning approach utilizing functionalities from statistical and linguistic ontology learning techniques. The model introduces, for the first time, two statistical features that have been shown to improve the overall score ranking of highly relevant terms for concept enrichment. The chapter also gives some recommendations and possible...
This paper proposes an improved concept vector space (ICVS) model that takes into account the importance of ontology concepts. Concept importance shows how important a concept is in an ontology and is reflected by the number of relations a concept has to other concepts. Concept importance is computed automatically by first converting the ontology into a graph and then employing one of the Markov-based algorithms. Concept importance is then aggregated with concept relevance, which is computed using the frequency of concept occurrences in the dataset. In order to demonstrate the applicability of our proposed model and to validate its efficacy, we conducted document classification experiments using the concept-based vector space model. The dataset used in this paper consists of 348 documents from the funding domain. The results show that the proposed model yields higher classification accuracy compared to the traditional concept vector space (CVS) model, ultimately giving better document classification performance. We also used different classifiers to check the classification accuracy: we tested CVS and ICVS with Naive Bayes and Decision Tree classifiers, and the results show that classification performance in terms of the F1 measure is improved with ICVS on both classifiers.
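As a sketch of computing concept importance with a Markov-based algorithm, the snippet below runs PageRank (one such algorithm) over a toy ontology graph with networkx; the concept names are placeholders.

```python
import networkx as nx

# Toy ontology converted to a directed graph; edges are concept relations.
g = nx.DiGraph()
g.add_edges_from([("Funding", "Grant"), ("Funding", "Budget"),
                  ("Grant", "Applicant"), ("Budget", "Grant")])

importance = nx.pagerank(g)  # concepts with more incoming relations rank higher
for concept, score in sorted(importance.items(), key=lambda kv: -kv[1]):
    print(f"{concept}: {score:.3f}")
```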
This paper presents the results of a subjective experiment on user behaviour analysis in state-of-the-art learning management systems (LMS) and massive open online courses (MOOCs). The purpose of this study is to conduct a usability analysis of different eLearning platforms by observing subjects’ facial expressions and, based on the generated results, to assess which platforms are easy to use and work with for a new user. An experiment is designed for this purpose with different tasks that each subject has to perform while being recorded. The facial recordings are analysed with facial expression software to find seven emotional engagement attributes and three sentiment engagement attributes. Our work yields some very interesting findings. Additionally, based on an extensive comparison of features among different LMS, we propose recommendations that will provide better content personalization and customization, thereby improving learning outcomes.
With the emerging technologies of augmented reality (AR) and virtual reality (VR), the learning process in today’s classroom is much more effective and motivational. Overlaying virtual content onto the real world makes learning methods attractive and entertaining for students while performing activities. AR techniques make the learning process easy and fun compared to traditional methods, which lack focused learning and interactivity with the educational content. To make learning effective, we propose to use handheld marker-based AR technology for primary school students. We developed a set of four applications based on the academic curriculum of primary school level for learning the English alphabet, decimal numbers, animals and birds, and an AR Globe for learning about different countries around the world. These applications can be played wherever and whenever a user wants, without Internet connectivity, subject to the availability of a tablet or mobile ...
Skin cancer is a widespread disease associated with eight diagnostic classes. The diagnosis of multiple types of skin cancer is a challenging task for dermatologists due to the phenotypic similarity of skin cancer classes. The average accuracy of multiclass skin cancer diagnosis is 62% to 80%. Therefore, the classification of skin cancer using machine learning can be beneficial in the diagnosis and treatment of patients. Several researchers have developed skin cancer classification models for the binary case but could not extend the research to multiclass classification with better performance ratios. We have developed deep learning-based ensemble classification models for multiclass skin cancer classification. Experimental results show that the individual deep learners perform well for skin cancer classification, but the development of an ensemble is still a meaningful approach since it further enhances classification accuracy. Results show that the accuracy of individual learners of Re...
Deep neural networks have emerged as a leading approach for handling many natural language processing (NLP) tasks. Deep networks initially conquered the problems of computer vision. However, dealing with sequential data such as text and sound was a nightmare for such networks, as traditional deep networks are not reliable in preserving contextual information. This may not harm results in the case of image processing, where we do not care about sequence, but when we consider data collected from text for processing, such networks may produce disastrous results. Moreover, establishing sentence semantics in a colloquial text such as Roman Urdu is a challenge. Additionally, the sparsity and high dimensionality of data in such informal text pose a significant challenge for building sentence semantics. To overcome this problem, we propose a deep recurrent architecture, RU-BiLSTM, based on bidirectional LSTM (BiLSTM) coupled with word embeddings and an attention mechanism...
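In the spirit of the RU-BiLSTM architecture described above, here is a minimal PyTorch sketch of a BiLSTM classifier with word embeddings and a simple attention layer; all hyperparameters are placeholders, not the paper's settings.

```python
import torch
import torch.nn as nn

class BiLSTMAttention(nn.Module):
    def __init__(self, vocab_size=10000, embed_dim=128, hidden_dim=64,
                 num_classes=2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        self.bilstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True,
                              bidirectional=True)
        self.attn = nn.Linear(2 * hidden_dim, 1)   # attention scorer
        self.fc = nn.Linear(2 * hidden_dim, num_classes)

    def forward(self, token_ids):                        # (batch, seq_len)
        h, _ = self.bilstm(self.embedding(token_ids))    # (batch, seq, 2*hid)
        weights = torch.softmax(self.attn(h), dim=1)     # attention over time
        context = (weights * h).sum(dim=1)               # weighted sum
        return self.fc(context)

model = BiLSTMAttention()
logits = model(torch.randint(0, 10000, (4, 20)))  # 4 sentences, 20 tokens each
print(logits.shape)                               # torch.Size([4, 2])
```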
This dataset comprises word embeddings and document topic distribution vectors generated from the transcripts of 12,032 video lectures from 200 courses collected from the Coursera learning platform. Two well-known natural language processing techniques, namely Word2Vec and Latent Dirichlet Allocation (LDA), implemented in the Gensim package in Python, are used to generate the word embeddings and topic vectors, respectively.
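A minimal sketch of the generation pipeline described above, using Gensim's Word2Vec and LdaModel; the two tiny "transcripts" are placeholders for the real Coursera lecture transcripts.

```python
from gensim.models import Word2Vec, LdaModel
from gensim.corpora import Dictionary

transcripts = [["neural", "networks", "learn", "representations"],
               ["supervised", "learning", "uses", "labeled", "data"]]

# Word embeddings.
w2v = Word2Vec(sentences=transcripts, vector_size=50, min_count=1)
print(w2v.wv["learning"].shape)            # (50,) embedding vector

# Document topic distribution vectors.
dictionary = Dictionary(transcripts)
corpus = [dictionary.doc2bow(doc) for doc in transcripts]
lda = LdaModel(corpus, num_topics=2, id2word=dictionary)
print(lda.get_document_topics(corpus[0]))  # topic distribution of document 0
```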
This paper presents a novel method for data hiding based on neighborhood pixel information to calculate the number of bits that can be used for substitution, and a modified Least Significant Bit (LSB) technique for data embedding. The modified solution is independent of the nature of the data to be hidden and gives correct results along with unnoticeable image degradation. To find the number of bits that can be used for data hiding, the technique uses the green component of the image, as the human eye is less sensitive to it, making it practically impossible for the human eye to tell whether the image carries hidden data. The application further encrypts the data using a custom-designed algorithm before embedding the bits into the image for additional security. The overall process consists of three main modules, namely embedding, encryption, and extraction.
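The fixed one-bit sketch below illustrates only the basic green-channel LSB embedding and extraction steps with NumPy and Pillow; the paper's method additionally adapts the number of substituted bits from neighborhood pixel information and encrypts the data first.

```python
import numpy as np
from PIL import Image

def embed(img_path, message, out_path):
    """Hide a short string in the green-channel LSBs; save losslessly (PNG)."""
    pixels = np.array(Image.open(img_path).convert("RGB"))
    bits = np.unpackbits(np.frombuffer(message.encode(), dtype=np.uint8))
    green = pixels[..., 1].ravel()                        # copy of green channel
    green[:len(bits)] = (green[:len(bits)] & 0xFE) | bits  # overwrite LSBs
    pixels[..., 1] = green.reshape(pixels[..., 1].shape)
    Image.fromarray(pixels).save(out_path)

def extract(img_path, n_chars):
    """Read back n_chars characters from the green-channel LSBs."""
    green = np.array(Image.open(img_path))[..., 1].ravel()
    bits = green[:n_chars * 8] & 1
    return np.packbits(bits).tobytes().decode()
```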
This paper provides a comparative performance analysis of shallow and deep machine learning classifiers for the speech recognition task using frame-level phoneme classification. Phoneme recognition is still a fundamental and equally crucial initial step toward automatic speech recognition (ASR) systems. Conventional classifiers often perform exceptionally well on domain-specific ASR systems with a limited vocabulary and training data, in contrast to deep learning approaches. It is thus imperative to compare the performance of deep artificial networks against conventional state-of-the-art machine learning classifiers in terms of correctly recognizing atomic speech units, i.e., phonemes. Two deep learning models, a DNN and an LSTM, with multiple configurations obtained by varying the number of layers and the number of neurons in each layer, are thoroughly studied on the OLLO speech corpus alongside six shallow machine learning classifiers using filterbank acoustic features. Additionally, features with three and ten frames of temporal context are computed and compared with no-context features for the different models. Classifier performance is evaluated in terms of precision, recall, and F1 score for 14 consonant and 10 vowel classes for 10 speakers with 4 different dialects. A high classification accuracy of 93% and a 95% F1 score are obtained with the DNN and LSTM networks, respectively, on context-dependent features with 3 hidden layers containing 1024 nodes each. SVM surprisingly obtained an even higher classification score of 96.13%, with a misclassification error of less than 5% for consonants and 4% for vowels.
This paper proposes a two-stage deep feed-forward neural network (DNN) to tackle the acoustic-to-articulatory inversion (AAI) problem. DNNs are a viable solution for the AAI task, but the temporal continuity of the estimated articulatory values has not been exploited properly when a DNN is employed. In this work, we propose to address the lack of temporal constraints while enforcing a parameter-parsimonious solution by deploying a two-stage solution based only on DNNs: (i) articulatory trajectories are estimated in a first DNN stage, and (ii) a temporal window of the estimated trajectories is used in a follow-up DNN stage as a refinement. The first-stage estimate can be thought of as auxiliary additional information that poses constraints on the inversion process. Experimental evidence demonstrates an average error reduction of 7.51% in terms of RMSE compared to the baseline, and an improvement of 2.39% with respect to the Pearson correlation is also attained. Finally...
This article presents a dataset of tweets in the Urdu language. There are 1,140,824 tweets in the dataset, collected from Twitter for September and October 2020. This large-scale corpus of tweets is generated through pre-processing that includes removing columns containing user information, retweet counts, and follower information; removing duplicate tweets, unnecessary punctuation, links, symbols, and spaces; and finally extracting emojis where present in the tweet text. In the final dataset, each tweet record contains columns for the tweet id, the text, and the emoji extracted from the text with a sentiment score. Emojis are extracted to validate machine learning models used for multilingual sentiment and behavior analysis. They are extracted using a Python script that searches for an emoji from a list of the 751 most frequently used emojis. If an emoji is present in the text, a column with the emoji description and sentiment score is added.
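The emoji-extraction step might look like the sketch below, scanning each tweet against an (emoji, description, sentiment score) lookup; the two-entry table and its scores are invented placeholders for the real list of 751 emojis.

```python
# Placeholder lookup: emoji -> (description, illustrative sentiment score).
EMOJI_TABLE = {"😀": ("grinning face", 0.57),
               "😢": ("crying face", -0.44)}

def extract_emojis(text):
    """Return (emoji, description, score) for each known emoji in the text."""
    return [(ch, *EMOJI_TABLE[ch]) for ch in text if ch in EMOJI_TABLE]

tweet = "ویکسین آ گئی 😀"
for emoji, desc, score in extract_emojis(tweet):
    print(emoji, desc, score)
```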
A multi-view multi-target correspondence framework employing deep learning on overlapping cameras for identity-aware tracking in the presence of occlusion is proposed. Our complete pipeline of detection, multi-view correspondence, fusion, and tracking greatly improves person correspondence across multiple wide-angled views over traditionally used feature sets and handcrafted descriptors. We transfer the learning of a deep convolutional neural network (CNN), trained to jointly learn pedestrian features and similarity measures, to establish identity correspondence of non-occluding targets across multiple overlapping cameras with varying illumination and human pose. Subsequently, the identity-aware foreground principal axes of visible targets in each view are fused onto the top view without requiring camera calibration or precise principal axis length information. The problem of ground-point localisation of targets on the top view is then solved via linear programming for optim...
This paper presents a study evaluating different acoustic feature map representations in two-dimensional convolutional neural networks (2D-CNN) on speech datasets for various speech-related tasks. Specifically, the task involves identifying useful 2D-CNN input feature maps for enhancing speaker identification, with the ultimate goal of improving speaker authentication and enabling voice as a biometric feature. Voice, in contrast to fingerprints and image-based biometrics, is a natural choice for hands-free communication systems where touch interfaces are inconvenient or dangerous to use. An effective input feature map representation may help the CNN exploit intrinsic voice features that not only address the instability issues of voice as an identifier for text-independent speaker authentication while preserving privacy, but also assist in developing efficacious voice-enabled interfaces. Three different acoustic features with three possible feature map representations are evaluated in this study. Results obtained on three speech corpora show that an interpolated baseline spectrogram performs best compared to Mel frequency spectral coefficients (MFSC) and Mel frequency cepstral coefficients (MFCC) when tested using 5-fold cross-validation with a 2D-CNN. On both text-dependent and text-independent datasets, raw spectrogram accuracy is 4% better than that of the traditional acoustic features.
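The three candidate feature maps can be computed with librosa as sketched below; the one-second random signal stands in for real speech, and the frame parameters are assumptions.

```python
import numpy as np
import librosa

sr = 16000
y = np.random.randn(sr).astype(np.float32)  # 1 s of stand-in "speech"

spec = np.abs(librosa.stft(y, n_fft=512, hop_length=160))      # spectrogram
mfsc = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=40)   # MFSC
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)             # MFCC

# Each 2D array can serve as an input feature map to a 2D-CNN.
print(spec.shape, mfsc.shape, mfcc.shape)
```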
Currently, the use of biosensor-enabled mobile healthcare workflow applications in mobile edge-cloud-enabled systems is increasing progressively. These applications are heavyweight and divided between a thin-client mobile device and a thick-server edge cloud for execution. Application partitioning is a mechanism in which applications are divided based on resource and energy parameters. However, existing application-partitioning schemes widely ignore security aspects of healthcare applications. This study devises a dynamic application-partitioning workload task-scheduling-secure (DAPWTS) algorithm framework that consists of different schemes, such as a min-cut algorithm, node searching, energy-enabled scheduling, failure scheduling, and security schemes. The goal is to minimize the energy consumption of nodes and divide the application between local nodes and edge nodes by applying the secure min-cut algorithm. Furthermore, the study devises the secure min-cut algorithm, which aims to...
A new form of distance and blended education has hit the market in recent years with the advent of massive open online courses (MOOCs), which have brought many opportunities to the educational sector. Consequently, the availability of learning content to vast demographics of people across locations has opened up a plethora of possibilities for everyone to gain new knowledge through MOOCs. This poses an immense challenge to content providers, as the manual effort required to properly structure and automatically organize the content for millions of video lectures daily becomes incredibly demanding. This paper therefore addresses this issue, as a small part of our proposed personalized content management system, by exploiting the voice pattern of the lecturer for identification and for classifying video lectures into the right speaker category. The use of Mel frequency cepstral coefficients (MFCC) as 2D input feature maps to a 2D-CNN has shown promising results in contrast to machine learning and deep learning classifiers, making text-independent speaker identification plausible in a MOOC setting for automatic video lecture categorization. It will not only help categorize educational videos efficiently for easy search and retrieval but will also promote effective utilization of micro-lectures and multimedia video learning objects (MLO).