Samira Sadaoui
University of Regina, Computer Science, Faculty Member
Our study explores offensive and hate speech detection for the Arabic language, for which previous studies are scarce. Based on two-class, three-class, and six-class Arabic Twitter datasets, we develop single and ensemble CNN and BiLSTM classifiers that we train with non-contextual (FastText-SkipGram) and contextual (Multilingual BERT and AraBERT) word-embedding models. For each hate/offensive classification task, we conduct a battery of experiments to evaluate the performance of the single and ensemble classifiers on the testing datasets. The average-based ensemble approach performed best, returning F-scores of 91%, 84%, and 80% for the two-class, three-class, and six-class prediction tasks, respectively. We also perform an error analysis of the best ensemble model for each task.
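As a rough illustration of what an average-based ensemble can look like, the sketch below averages the per-class probabilities produced by several classifiers and picks the class with the highest mean. The function name and the probability values are hypothetical and are not taken from the paper, which may combine its models differently.

```python
from typing import List, Sequence


def average_ensemble(prob_sets: Sequence[Sequence[Sequence[float]]]) -> List[int]:
    """Average class probabilities over models and return the argmax class per sample.

    prob_sets[m][i][c] is the probability that model m assigns to class c for sample i.
    """
    n_models = len(prob_sets)
    n_samples = len(prob_sets[0])
    predictions = []
    for i in range(n_samples):
        n_classes = len(prob_sets[0][i])
        # Mean probability of each class across all models for sample i.
        mean = [
            sum(model[i][c] for model in prob_sets) / n_models
            for c in range(n_classes)
        ]
        predictions.append(max(range(n_classes), key=mean.__getitem__))
    return predictions


# Hypothetical outputs of three classifiers on two tweets (two-class task).
cnn = [[0.6, 0.4], [0.2, 0.8]]
bilstm = [[0.3, 0.7], [0.1, 0.9]]
third_model = [[0.8, 0.2], [0.4, 0.6]]
print(average_ensemble([cnn, bilstm, third_model]))
```

Averaging probabilities rather than counting votes lets a confident model outweigh two uncertain ones, which is one common motivation for this kind of ensemble.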
Constraint optimization consists of looking for an optimal solution that maximizes a given objective function while meeting a set of constraints. In this study, we propose a new algorithm based on mushroom reproduction for solving constraint optimization problems. Our algorithm, which we call Mushroom Reproduction Optimization (MRO), is inspired by the natural reproduction and growth mechanisms of mushrooms. This process includes the discovery of rich areas with good living conditions that allow spores to grow and develop their own colonies. Given that constraint optimization problems often suffer from a high computational time cost, we thoroughly assess MRO performance on well-known constrained engineering and real-world problems. The experimental results confirm the high performance of MRO, compared to other known metaheuristics, in dealing with complex optimization problems.
Shill Bidding (SB) is still a predominant auction fraud because it is the toughest to identify, owing to its resemblance to standard bidding behavior. To reduce losses on the buyers' side, we devise an example-incremental classification model that can detect fraudsters from incoming auction transactions. Thousands of auctions occur every day on a commercial site, and to process this continuous, rapid data flow, we introduce a chunk-based incremental classification algorithm, which also tackles the imbalanced and non-linear learning issues. We train the algorithm incrementally with several training SB chunks and concurrently assess the performance and speed of the newly learned models using unseen SB chunks.
We present a new nature-inspired approach based on the Focus Group Optimization Algorithm (FGOA) for solving Constraint Satisfaction Problems (CSPs). CSPs are NP-complete problems, meaning that solving them by classical systematic search methods requires exponential time, in theory. Appropriate alternatives are approximation methods such as metaheuristic algorithms, which have shown successful results when solving combinatorial problems. FGOA is a new metaheuristic inspired by a human collaborative problem-solving approach. In this paper, the steps of applying FGOA to CSPs are elaborated. More precisely, a new diversification method is devised to enable the algorithm to find solutions to CSPs efficiently by escaping local optima. To assess the performance of the proposed Discrete FGOA (DFGOA) in practice, we conducted several experiments on randomly generated, hard-to-solve CSP instances (those near the phase transition) using the RB model. The results clearly show the ability of DFGOA to find the solutions to these problems in a very reasonable amount of time.
Shill Bidding (SB) is a serious auction fraud committed by clever scammers. The challenge in labeling multi-dimensional SB training data hinders research on SB classification. To safeguard individuals from shill bidders, in this study we explore Semi-Supervised Classification (SSC), which is the most suitable method for our fraud detection problem since SSC can learn efficiently from few labeled data. To label a portion of SB data, we propose an anomaly detection method that we combine with hierarchical clustering. We carry out several experiments to determine statistically the minimal sufficient amount of labeled data required to achieve the highest accuracy. We also investigate the misclassified bidders to see where the misclassification occurs. The empirical analysis demonstrates that SSC reduces the laborious effort of labeling SB data.
With the growing recognition of the importance of Project Management (PM), new solutions are still being researched to improve PM practices in environments where there is a restriction on the project types. PM is becoming more widespread in business and academia, but without enough information about the course of action to be taken to achieve success. This is particularly true in the IT sector, where the impact of new technologies is felt faster than in any other area. The present study reviews the current state of IT project management based on an online survey that we conducted with companies worldwide. Our aim is to provide insights and recommendations on how to increase the projects' success rate based on the results of the survey analysis.
Constraint Satisfaction Problems (CSPs) are known NP-complete problems, for which systematic search methods incur exponential time costs. To overcome this limitation, an alternative is to use metaheuristics. However, these techniques often suffer from premature convergence, mainly due to a lack of adequate diversity among the potential solutions. To address this challenge, we update the Discrete Firefly Algorithm (DFA) with chaos theory. We call the proposed algorithm the Chaotic Discrete Firefly Algorithm (CDFA). To assess the performance of the proposed CDFA in practice, we conducted several experiments on CSP instances randomly generated with the RB model. The results of the experiments demonstrate the efficiency of CDFA in dealing with CSPs.
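The abstract does not say which chaotic map is used, so the following is only a sketch of the general idea: a chaotic sequence replaces uniform random draws to diversify candidate solutions. It assumes the logistic map x(n+1) = r·x(n)·(1 − x(n)) with r = 4, a common choice in chaotic metaheuristics; the helper `chaotic_value_index`, which maps a chaotic number to an index in a variable's domain, is hypothetical.

```python
from typing import List


def logistic_map(x0: float, n: int, r: float = 4.0) -> List[float]:
    """Return n successive values of the logistic map starting from x0 in (0, 1)."""
    values = []
    x = x0
    for _ in range(n):
        x = r * x * (1.0 - x)
        values.append(x)
    return values


def chaotic_value_index(x: float, domain_size: int) -> int:
    """Map a chaotic number in [0, 1] to an index into a discrete variable domain."""
    return min(int(x * domain_size), domain_size - 1)


# Use the chaotic sequence to pick new values for five CSP variables,
# each with a hypothetical domain of 10 values.
sequence = logistic_map(0.3, 5)
new_assignment = [chaotic_value_index(x, 10) for x in sequence]
print(sequence[0], new_assignment)
```

Starting points such as 0, 0.25, 0.5, 0.75, and 1 should be avoided with r = 4, because the sequence then collapses to a fixed point instead of wandering over the interval.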
Shill Bidding (SB) has been recognized as the predominant online auction fraud and also the most difficult to detect due to its similarity to normal bidding behavior. Previously, we produced a high-quality SB dataset based on actual auctions and effectively labeled the instances as normal or suspicious. To overcome the serious problem of imbalanced SB datasets, in this study we investigate over- and under-sampling techniques through several instance-based classification algorithms. Thousands of auctions occur on eBay every day, and auction data may be sent continuously to the optimal fraud classifier to detect potential SB activities. Consequently, instance-based classification is appropriate for our particular fraud detection problem. According to the experimental results, incremental classification returns high performance for both over- and under-sampled SB datasets. Still, over-sampling slightly outperforms under-sampling for both the normal and suspicious classes across all the classifiers.
E-auctions have attracted serious fraud, such as Shill Bidding (SB), due to the large amount of money involved and the anonymity of users. SB is difficult to detect given its similarity to normal bidding behavior. To this end, we develop an efficient SVM-based fraud classifier that enables auction companies to distinguish between legitimate and shill bidders. We introduce a robust approach to build the optimal SB classifier offline. To produce SB training data, we combine hierarchical clustering with our own labelling strategy, and then utilize a hybrid data sampling method to solve the issue of highly imbalanced SB datasets. To avert financial loss in new auctions, the SB classifier is to be launched at the end of the bidding period and before auction finalization. Based on commercial auction data, we conduct experiments for offline and online SB detection. The classification results exhibit good detection accuracy and misclassification rate for shill bidders.
In the last three decades, we have seen a significant increase in the trading of goods and services through online auctions. However, this business has created an attractive environment for malicious moneymakers who can commit different types of fraud, such as Shill Bidding (SB). The latter is predominant across many auctions, but this type of fraud is difficult to detect due to its similarity to normal bidding behaviour. The unavailability of SB datasets makes the development of SB detection and classification models burdensome. Furthermore, to implement efficient SB detection models, we should produce SB data from actual auctions on commercial sites. In this study, we first scraped a large number of eBay auctions of a popular product. After preprocessing the raw auction data, we built a high-quality SB dataset based on the most reliable SB strategies. The aim of our research is to share the preprocessed auction dataset as well as the SB training (unlabelled) dataset, so that researchers can apply various machine learning techniques to authentic auction and fraud data.
The option of organizing E-auctions to purchase the electricity required for an anticipated peak-load period is a new one for utility companies. To meet the extra demand load, we develop an electricity combinatorial reverse auction (CRA) for the purpose of procuring power from diverse energy sources. In this new, smart electricity market, suppliers of different scales can participate, and home-owners may even take an active role. In our CRA, an item, which is subject to several trading constraints, denotes a time slot that has two conflicting attributes, electricity quantity and price. To secure electricity, we design our auction with two bidding rounds: round one is exclusively for variable energy, and round two allows storage and non-intermittent renewable energy to bid on the remaining items. Our electricity auction leads to a complex winner determination (WD) task that we represent as a resource procurement optimization problem. We solve this problem using multi-objective genetic algorithms in order to find the trade-off solution that best lowers the price and increases the quantity. This solution consists of multiple winning suppliers, their prices, quantities and schedules. We validate our WD approach on large-scale simulated datasets. We first assess the time-efficiency of our WD method, and we then compare it to well-known heuristic and exact WD techniques. To gain a precise idea of the accuracy of WD, we implement two well-known exact algorithms for our constrained combinatorial procurement problem.
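To make the notion of a trade-off between the two conflicting attributes concrete, here is a minimal sketch of Pareto dominance for candidate solutions described by a (price, quantity) pair, where price is minimized and quantity is maximized. The tuples and function names are hypothetical, and the sketch shows only the dominance test that multi-objective genetic algorithms typically rely on, not the authors' WD method.

```python
from typing import List, Tuple

Solution = Tuple[float, float]  # (total price, total quantity)


def dominates(a: Solution, b: Solution) -> bool:
    """True if a is no worse than b on both objectives and strictly better on one."""
    no_worse = a[0] <= b[0] and a[1] >= b[1]
    strictly_better = a[0] < b[0] or a[1] > b[1]
    return no_worse and strictly_better


def pareto_front(solutions: List[Solution]) -> List[Solution]:
    """Keep only the solutions that no other solution dominates."""
    return [s for s in solutions if not any(dominates(o, s) for o in solutions)]


# Hypothetical candidate allocations for one time slot.
candidates = [(10.0, 5.0), (12.0, 5.0), (8.0, 3.0), (9.0, 6.0)]
print(pareto_front(candidates))
```

Here (9.0, 6.0) is both cheaper and larger than (10.0, 5.0) and (12.0, 5.0), so only it and the cheapest option (8.0, 3.0) remain as trade-off candidates.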
This study introduces an advanced Combinatorial Reverse Auction (CRA) that is multi-unit, multi-attribute and multi-objective, and that is subject to buyer and seller trading constraints. Conflicting objectives may occur since the buyer can maximize some attributes and minimize others. To address the Winner Determination (WD) problem for this type of CRA, we propose an optimization approach based on genetic algorithms that we integrate with our variants of diversity and elitism strategies to improve the solution quality. Moreover, by maximizing the buyer's revenue, our approach is able to return the best solution for our complex WD problem. We conduct a case study as well as simulated testing to illustrate the importance of the diversity and elitism schemes. We also validate the proposed WD method through simulated experiments by generating large instances of our CRA problem. The experimental results demonstrate, on the one hand, the performance of our WD method in terms of several quality measures, such as solution quality, run-time complexity and the trade-off between convergence and diversity, and on the other hand, its significant superiority to well-known heuristic and exact WD techniques that have been implemented for much simpler CRAs.
Online auctioning has attracted serious fraud given the huge amount of money involved and the anonymity of users. In the auction fraud detection domain, class imbalance, which means that fewer fraud instances are present in bidding transactions, negatively impacts the classification performance because the latter is biased towards the majority class, i.e., normal bidding behavior. The best-designed approach to handle the imbalanced learning problem is data sampling, which was found to improve the classification efficiency. In this study, we utilize a hybrid method of data over-sampling and under-sampling to address the issue of highly imbalanced auction fraud datasets more effectively. We deploy a set of well-known binary classifiers to understand how the class imbalance affects the classification results. We choose the most relevant performance metrics to deal with both imbalanced data and fraud bidding data.
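The abstract does not name the specific sampling techniques, so the sketch below assumes the simplest variants: random under-sampling of the majority class without replacement and random over-sampling of the minority class by duplication. The function name, the target size, and the toy records are all hypothetical.

```python
import random
from typing import List, Sequence, Tuple


def hybrid_resample(
    majority: Sequence, minority: Sequence, target_per_class: int, seed: int = 0
) -> Tuple[List, List]:
    """Shrink the majority class and grow the minority class toward a common size."""
    rng = random.Random(seed)
    # Under-sampling: keep a random subset of the majority class, without replacement.
    kept_majority = rng.sample(list(majority), min(target_per_class, len(majority)))
    # Over-sampling: keep every minority instance, then add random duplicates
    # until the target size is reached (no duplicates if it is already larger).
    extra = max(0, target_per_class - len(minority))
    boosted_minority = list(minority) + [rng.choice(list(minority)) for _ in range(extra)]
    return kept_majority, boosted_minority


# Hypothetical imbalanced data: 100 normal bidders and 5 suspicious ones.
normal_bidders = list(range(100))
suspicious_bidders = ["f1", "f2", "f3", "f4", "f5"]
normal_sample, suspicious_sample = hybrid_resample(normal_bidders, suspicious_bidders, 20)
print(len(normal_sample), len(suspicious_sample))
```

Meeting in the middle in this way discards less majority data than pure under-sampling and creates fewer duplicates than pure over-sampling, which is the usual argument for a hybrid scheme.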
This study introduces a new type of Combinatorial Reverse Auction (CRA), with multi-unit, multi-attribute and multi-objective products that are subject to buyer and seller constraints. In this advanced CRA, buyers may maximize some attributes and minimize others. To address the Winner Determination (WD) problem in the presence of multiple conflicting objectives, we propose an optimization approach based on genetic algorithms. To improve the quality of the winning solution, we incorporate our own variants of the diversity and elitism strategies. We illustrate the WD process with a real case study. Afterwards, we validate the proposed approach through artificial datasets by generating large instances of our multi-objective CRA problem. The experimental results demonstrate, on the one hand, the performance of our WD method in terms of three quality metrics, and on the other hand, its significant superiority to well-known heuristic and exact WD techniques that have been defined for simpler CRAs. (BEST PAPER AWARD)
Monitoring the progress of auctions for fraudulent bidding activities is crucial for detecting and stopping fraud during runtime to prevent fraudsters from succeeding. To this end, we introduce a stage-based framework to monitor multiple live auctions for In-Auction Fraud (IAF). Creating a stage-based fraud monitoring system differs from what has been previously proposed in the very limited studies on runtime IAF detection. More precisely, we launch the IAF monitoring operation at several time points in each running auction depending on its duration. At each auction time point, our framework first detects IAF by evaluating each bidder's stage activities based on the most reliable set of IAF patterns, and then takes appropriate actions to react to dishonest bidders. We develop the proposed framework with a dynamic agent architecture where multiple monitoring agents can be created and deleted with respect to the status of their corresponding auctions (initialized, completed or cancelled). The adoption of a dynamic software architecture represents an excellent solution to the scalability and time-efficiency issues of IAF monitoring systems, since hundreds of live auctions are held simultaneously in commercial auction houses. Every time an auction is completed or terminated, the participants' fraud scores are updated dynamically. Our approach enables us to observe each bidder in each live auction and manage their fraud score as well. We validate the IAF monitoring service with commercial auction data. We conduct three experiments to detect and react to shill-bidding fraud by employing datasets acquired from auctions of two valuable items, Palm PDA and XBOX. We observe each auction at three time points, verifying for each one the shill patterns that most likely happen in the corresponding stage.
Auctioning multi-dimensional items is a key challenge that requires rigorous tools. This study proposes a multi-round, first-score, semi-sealed multi-attribute reverse auction system. A fundamental concern in multi-attribute auctions is acquiring a useful description of the buyers' individual requirements: hard constraints and qualitative preferences. To consider real requirements, we express dependencies among attributes. Indeed, our system enables buyers to specify conditional constraints as well as conditional preferences. However, determining the winner with diverse criteria may be very time-consuming. Therefore, it is more useful for our auction to process quantitative data. A challenge here is to satisfy buyers with more facilities and, at the same time, keep the auctions efficient. To meet this challenge, our system maps the qualitative preferences into a multi-criteria decision rule. It also completely automates the winner determination, since it is very difficult for buyers to estimate the attribute weights quantitatively and define the attribute value functions. Our procurement auction looks for the outcome that satisfies all the constraints and best matches the preferences. We demonstrate the feasibility and measure the time performance of the proposed system through a 10-attribute auction. Finally, we assess the user acceptance of our requirements specification and winner selection tool.
In spite of the many advantages of online auctioning, serious frauds threaten the interests of auction users. Today, monitoring auctions for fraud is becoming crucial. We propose here a generic framework that covers real-time monitoring of multiple live auctions. The monitoring is performed at different auction times depending on the fraud types and the auction duration. We divide the real-time monitoring functionality into three tasks: detecting frauds, reacting to frauds, and updating bidders' clusters. The first task examines bidding activities in ongoing auctions at run-time by applying fraud detection mechanisms. The second one determines how to react to suspicious activities by taking appropriate run-time actions against the fraudsters and the infected auctions. Finally, every time an auction ends, successfully or unsuccessfully, the participants' fraud scores and their clusters are updated dynamically. Using simulated auction data, we conduct an experiment to monitor live auctions for shill bidding. The latter is considered the most severe fraud in online auctions, and the most difficult to detect. More precisely, we monitor each live auction at three time points, and for each of them, we verify the shill patterns that most likely happen.
Trust management is becoming crucial in open systems because they may contain malicious and untrustworthy service providers. Trust management in multi-agent systems (used to model open systems) has gained a huge amount of attention from researchers in recent years. In our previous work, we proposed a generic agent trust management framework, called ScubAA, which is based on the theory of Human Plausible Reasoning. ScubAA first recommends to the trustor (e.g. a user) a personalized ranked list of the most trusted trustees (e.g. service providers), within the context of the trustor's request, and then forwards the request to those trusted trustees only. In this article, we are particularly interested in comparing, from a theoretical perspective, ScubAA with four other trust management systems that we selected from the vast literature on trust management. This comparison highlights significant factors that agent trust management systems utilize in their trust evaluation process. It also shows that ScubAA is able to consider more trust evidence to reach a more accurate value of trust. Indeed, ScubAA introduces a single unified framework that considers various important aspects of trust management, such as the trustor's feedback, the history of the trustor's interactions, the context of the trustor's request, third-party references from trustors as well as from trustees, and the structure of the society of trustor and trustee agents.
Deep learning has been adopted successfully in hate speech detection problems, but only minimally for the Arabic language. Also, the effect of word-embedding models on the neural network's performance has not been adequately examined in the literature. Through 2-class, 3-class, and 6-class classification tasks, we investigate the impact of both word-embedding models and neural network architectures on the predictive accuracy. We first train several word-embedding models on a large-scale Arabic text corpus. Next, based on a reliable dataset of Arabic hate and offensive speech, we train several neural networks for each detection task using the pre-trained word embeddings. This task yields a large number of learned models, which allows us to conduct an exhaustive comparison. The experiments demonstrate the superiority of the skip-gram models and CNN networks across the three detection tasks.
This research explores Cost-Sensitive Learning (CSL) in the fraud detection domain to decrease the fraud class's incorrect predictions and increase its accuracy. Notably, we concentrate on shill bidding fraud, which is challenging to detect because the behaviors of shill and legitimate bidders are similar. We investigate CSL within the Semi-Supervised Classification (SSC) framework to address the scarcity of labeled fraud data. Our paper is the first attempt to integrate CSL with SSC for fraud detection. We adopt a meta-CSL approach to manage the costs of misclassification errors, while SSC algorithms are trained with imbalanced data. Using an actual shill bidding dataset, we assess the performance of several hybrid models of CSL and SSC and then compare their misclassification error and accuracy rates statistically. The most efficient CSL+SSC model was able to detect 99% of fraudsters with the lowest total cost.
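One common way to make a classifier cost-sensitive without changing how it is trained is to wrap it and choose, for each instance, the class with the lowest expected misclassification cost. The sketch below illustrates that decision rule only; the cost matrix and probabilities are hypothetical, and the paper's meta-CSL approach may work differently.

```python
from typing import List, Sequence


def min_cost_class(probs: Sequence[float], cost: Sequence[Sequence[float]]) -> int:
    """Return the prediction with the lowest expected cost.

    probs[j] is the estimated probability that the true class is j;
    cost[j][k] is the cost of predicting k when the true class is j.
    """
    n = len(probs)
    expected: List[float] = [
        sum(probs[j] * cost[j][k] for j in range(n)) for k in range(n)
    ]
    return min(range(n), key=expected.__getitem__)


# Hypothetical costs: class 0 = normal bidder, class 1 = shill bidder.
# Missing a shill bidder (true 1, predicted 0) is assumed 10 times worse
# than a false alarm (true 0, predicted 1).
COST = [[0.0, 1.0], [10.0, 0.0]]

print(min_cost_class([0.8, 0.2], COST))
print(min_cost_class([0.95, 0.05], COST))
```

With these costs, a bidder with only a 20% estimated fraud probability is still flagged, because the expected cost of ignoring them (0.2 × 10 = 2.0) exceeds the expected cost of a false alarm (0.8 × 1 = 0.8).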