Proceedings of ICiCSE 2017 (International Conference on Innovation in Computer Science and Engineering)
Objectives: The objective of this paper is to classify multimodal human actions from the Berkeley Multimodal Human Action Database (MHAD). Methods/Statistical analysis: Actions from the accelerometer and motion capture modalities are utilized in this study. Extracted features include statistical measures such as minimum, maximum, mean, median, standard deviation, kurtosis and skewness. Feature-extraction-level fusion is applied to form a feature vector comprising both modalities. Feature selection is implemented using Particle Swarm Optimization (PSO), Tabu and Ranker. Classification is performed with Support Vector Machine (SVM), Random Forest (RF), k-Nearest Neighbour (k-NN) and Best First Tree (BFT). Findings: The classification model that gave the highest accuracy is the Support Vector Machine with a Radial Basis Function kernel, with a correct classification rate (CCR) of 97.6% for the accelerometer modality (Acc), 99.8% for the motion capture modality (Mocap) and 99.8% for the fusion modality (FusionMA). In the feature selection process, Ranker selected every extracted feature (162 features for Acc, 1161 for Mocap and 1323 for FusionMA) and produced an average CCR of 97.4%. In comparison, PSO (68 features for Acc, 350 for Mocap and 412 for FusionMA) produced an average CCR of 97.1%, and Tabu (54 features for Acc, 199 for Mocap and 323 for FusionMA) produced an average CCR of 97.2%. Although Ranker gave the best result, the difference in average CCR is not significant; PSO and Tabu may therefore be more suitable in this case, as the reduced feature sets can speed up computation and reduce complexity. Application/Improvements: The extracted statistical features are able to produce high accuracy in the classification of multimodal human actions. Feature-extraction-level fusion of the two modalities performs better in classification than either single modality.
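As a rough illustration (not the paper's implementation), the seven per-channel statistics and the feature-extraction-level fusion described in the abstract could be sketched as follows; the function names, channel counts and sample lengths are hypothetical:

```python
import numpy as np

def statistical_features(signal):
    """Compute the seven statistics used as features for one channel:
    minimum, maximum, mean, median, standard deviation, kurtosis, skewness.
    Kurtosis and skewness are computed from central moments with NumPy."""
    mu = signal.mean()
    sigma = signal.std()
    centered = signal - mu
    skewness = (centered ** 3).mean() / sigma ** 3
    kurtosis = (centered ** 4).mean() / sigma ** 4
    return np.array([signal.min(), signal.max(), mu,
                     np.median(signal), sigma, kurtosis, skewness])

def fuse(acc_channels, mocap_channels):
    """Feature-extraction-level fusion: concatenate the per-channel
    feature vectors of both modalities into a single feature vector."""
    feats = [statistical_features(c) for c in acc_channels]
    feats += [statistical_features(c) for c in mocap_channels]
    return np.concatenate(feats)

# Toy example: 3 accelerometer channels and 2 mocap channels, 100 samples each.
rng = np.random.default_rng(0)
acc = [rng.standard_normal(100) for _ in range(3)]
mocap = [rng.standard_normal(100) for _ in range(2)]
vec = fuse(acc, mocap)
print(vec.shape)  # (35,) -- 7 statistics x 5 channels
```

The fused vector would then be passed to a feature selector (PSO, Tabu or Ranker) and a classifier such as an RBF-kernel SVM.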
This work aims to develop an information retrieval application based on augmented reality (AR) technologies to enhance visitors' experience in a museum exhibition. The purpose of developing this application is to give museum visitors a customized interactive experience through a handheld smartphone. The application recognizes objects of interest through the smartphone's camera feed in real time, retrieves information about those objects and overlays the information on the object. This is achieved with vision-based AR utilizing 3D object tracking, thus eliminating the use of markers, which can prove unreliable due to occlusion or damage.
This paper describes the baseline corpus of a new multimodal biometric database, the MMU GASPFA (Gait-Speech-Face) database. The corpus in GASPFA is acquired using commercial off-the-shelf (COTS) equipment, including digital video cameras, a digital voice recorder, a digital camera, a Kinect camera and accelerometer-equipped smartphones. The corpus consists of frontal face images from the digital camera, speech utterances recorded using the digital voice recorder, gait videos with their associated data recorded simultaneously using both the digital video cameras and the Kinect camera, as well as accelerometer readings from the smartphones. A total of 82 participants had their biometric data recorded. MMU GASPFA is able to support both multimodal biometric authentication and gait action recognition. This paper describes the acquisition setup and protocols used in MMU GASPFA, as well as the content of the corpus. Baseline results from a subset of the participants are presented for validation purposes.
This paper presents the segmentation of CT head images with different techniques. The system partitions the CT head images into four regions: skull, calcifications, cerebrospinal fluid (CSF) and brain matter. The method consists of two phases. The first phase partitions the skull, CSF and brain matter, for which we applied the expectation-maximization (EM) algorithm. The second phase identifies the calcifications using thresholding. The system has been tested on a number of real CT head images and has achieved promising results.
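The second-phase thresholding step can be sketched in a few lines; note that the threshold value below is purely illustrative and is not taken from the paper, which does not state one in this abstract:

```python
import numpy as np

def segment_calcifications(ct_slice, threshold=100.0):
    """Identify calcification candidates by intensity thresholding:
    pixels brighter than `threshold` are flagged as calcification.
    The value 100.0 is a hypothetical cut-off for this sketch."""
    return ct_slice > threshold

# Toy 4x4 "CT slice" containing two bright (calcified) pixels.
img = np.array([[10.,  20., 150.,  30.],
                [40., 200.,  15.,  25.],
                [ 5.,  10.,  20.,  30.],
                [12.,  18.,  22.,  28.]])
mask = segment_calcifications(img)
print(int(mask.sum()))  # 2 pixels exceed the threshold
```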
This paper describes the acquisition setup and development of a new gait database, MMUGait DB. The database was captured in side and oblique views, with 82 subjects walking under normal conditions and 19 subjects walking under 11 covariate factors. The database includes the sarong and kain samping as changes of apparel; these are traditional costumes of ethnic Malays in South East Asia. Classification experiments were carried out on MMUGait DB and the baseline results are presented for validation purposes.
In this paper, we propose a methodology consisting of several unsupervised clustering techniques to acquire a satisfactory segmentation of computed tomography (CT) brain images. The ultimate goal of segmentation is to obtain three segmented images: the abnormalities, cerebrospinal fluid (CSF) and brain matter. The proposed approach comprises two segmentation phases. In the first phase, a combination of k-means and fuzzy c-means (FCM) methods is implemented to partition the images into binary images. From the binary images, a decision tree is then utilized to annotate the connected components as normal or abnormal regions. For the second phase, the experimental results have shown that modified FCM with population-diameter independent (PDI) segmentation is more feasible and yields satisfactory results.
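For readers unfamiliar with FCM, a minimal plain fuzzy c-means on 1-D intensities is sketched below. This is the standard algorithm only; it does not reproduce the paper's modified FCM with PDI, and all parameter values are hypothetical:

```python
import numpy as np

def fuzzy_c_means(X, c=2, m=2.0, n_iter=50, seed=0):
    """Standard fuzzy c-means on a 1-D intensity vector X of shape (n,).
    Returns cluster centres (c,) and the membership matrix U of shape (n, c).
    m > 1 is the fuzziness exponent; m = 2 is a common default."""
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), c))
    U /= U.sum(axis=1, keepdims=True)          # memberships sum to 1 per point
    for _ in range(n_iter):
        w = U ** m
        centres = (w * X[:, None]).sum(axis=0) / w.sum(axis=0)
        d = np.abs(X[:, None] - centres[None, :]) + 1e-12   # avoid divide-by-zero
        inv = d ** (-2.0 / (m - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True)            # membership update
    return centres, U

# Intensities drawn from two well-separated groups (toy "tissue classes").
X = np.concatenate([np.full(50, 10.0), np.full(50, 200.0)])
X += np.random.default_rng(1).normal(0, 1, size=100)
centres, U = fuzzy_c_means(X)
labels = U.argmax(axis=1)   # hardening memberships gives the binary partition
print(np.sort(np.round(centres)))  # centres near the two group means
```

Thresholding the memberships (here via argmax) yields the binary images that the first phase feeds to the decision tree.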