
    Reza Safabakhsh

A wide variety of deep reinforcement learning (DRL) models have recently been proposed to learn profitable investment strategies. The rules learned by these models outperform previous strategies, especially in high-frequency trading environments. However, it has been shown that the quality of the features extracted from a long-term sequence of raw instrument prices greatly affects the performance of the trading rules learned by these models. Employing a neural encoder-decoder structure to extract informative features from complex input time series has proved very effective in other popular tasks, such as neural machine translation and video captioning, in which the models face a similar problem. The encoder-decoder framework extracts highly informative features from a long sequence of prices while learning how to generate outputs based on the extracted features. In this paper, a novel end-to-end model based on the neural encoder-decoder framework combined with DRL is proposed to...
Non-intrusive person recognition in different poses is a crucial task in the field of human-robot interaction (HRI). Most previous work on person recognition has been based on the face. In these approaches, if the person's face is not visible to the robot, the robot must ask the person to stand in a suitable location and look at it in order to be recognized. In a natural interaction, however, the robot should be able to recognize a person in real time without requiring user cooperation. In this paper, we address this problem using a combination of the face and 3D information about the person's body. We use a mathematical framework based on Bayesian decision theory for decision making. This combination can be applied in scenarios where the person's pose is unconstrained, which makes it difficult to recognize the person with common face recognition algorithms. We tested the proposed approach on recognizing a group of five people in various poses, such as sitting and walking in a circle. The method was applied to a service robot equipped with a Kinect sensor. The results show a mean accuracy of 40.48% over all poses using the face alone, 67.23% using the body alone, and 80.23% using the combination of face and body information.
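To illustrate the kind of Bayesian fusion of face and body evidence described above, here is a minimal sketch. It assumes each modality supplies per-identity likelihood scores and that the two cues are conditionally independent given the identity; the function names and the independence assumption are illustrative, not taken from the paper.

```python
import numpy as np

def fuse_identities(face_likelihoods, body_likelihoods, priors=None):
    """Combine per-identity likelihoods from two modalities with Bayes' rule.

    Assumes conditional independence of the face and body evidence given
    the identity (a simplifying assumption, not necessarily the paper's model).
    """
    face = np.asarray(face_likelihoods, dtype=float)
    body = np.asarray(body_likelihoods, dtype=float)
    priors = np.ones_like(face) / face.size if priors is None else np.asarray(priors)

    joint = priors * face * body            # unnormalized posterior per identity
    posterior = joint / joint.sum()         # normalize to a probability distribution
    return posterior, int(np.argmax(posterior))

# Example: 5 known people; the face cue is weak (person turned away), the body cue helps.
face_l = [0.21, 0.20, 0.19, 0.20, 0.20]     # nearly uninformative face likelihoods
body_l = [0.05, 0.60, 0.10, 0.15, 0.10]     # 3D body cue clearly favours person 1
posterior, decision = fuse_identities(face_l, body_l)
print(posterior.round(3), "-> recognized person index:", decision)
```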
Multi-task learning (MTL) is a popular method in machine learning that exploits information shared among related tasks to learn each task more efficiently and accurately. Naively, one can benefit from MTL by using a weighted linear sum of the loss functions of the different tasks. Manually specifying appropriate weights is difficult and typically does not improve performance, so it is critical to find an automatic weighting strategy for MTL. Three types of uncertainty can be captured in deep learning. Epistemic uncertainty is related to the lack of data. Heteroscedastic aleatoric uncertainty depends on the input data and differs from one input to another. In this paper, we focus on the third type, homoscedastic aleatoric uncertainty, which is constant across inputs and is task-dependent. Some existing methods learn uncertainty-based weights as parameters of a model. In this paper, however, we introduce a novel multi-task loss function to capture homoscedastic...
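For context, here is a minimal sketch of the kind of uncertainty-based weighting that the abstract cites as prior work, where one log-variance per task is learned jointly with the model. This is a generic illustration of that idea, not the novel loss proposed in the paper; the class name and penalty form are assumptions.

```python
import torch
import torch.nn as nn

class UncertaintyWeightedLoss(nn.Module):
    """Weights several task losses by learned homoscedastic (task-dependent,
    input-independent) uncertainties, in the style of prior uncertainty-based
    weighting schemes; a generic sketch, not the loss introduced in the paper."""

    def __init__(self, num_tasks: int):
        super().__init__()
        # log(sigma_i^2) for each task, learned jointly with the model weights
        self.log_vars = nn.Parameter(torch.zeros(num_tasks))

    def forward(self, task_losses):
        total = 0.0
        for i, loss in enumerate(task_losses):
            precision = torch.exp(-self.log_vars[i])              # 1 / sigma_i^2
            total = total + precision * loss + self.log_vars[i]   # penalty keeps sigma finite
        return total

# Usage: combine a regression loss and a classification loss from the same network.
criterion = UncertaintyWeightedLoss(num_tasks=2)
loss = criterion([torch.tensor(0.8), torch.tensor(1.7)])
print(loss.item())
```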
... for bi-level thresholding, the optimal threshold, $t^*$, is chosen so that the between-class entropy $f_1(t)$ is maximized [1]; that is, $t^* = \arg\max_{0 \le t \le L-1} \{ f_1(t) \}$ (3) ... for multi-level thresholding with $M-1$ thresholds, $\{t_1^*, t_2^*, \ldots, t_{M-1}^*\} = \arg\max_{0 \le t_1 < t_2 < \cdots < t_{M-1} \le L-1} \{ f_{M-1}(t_1, t_2, \ldots, t_{M-1}) \}$ (5) ...
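As a concrete illustration of the argmax above, the sketch below exhaustively searches threshold tuples that maximize a plug-in between-class criterion. The paper maximizes a between-class entropy; here an Otsu-style between-class variance is used only as an example criterion, and the bin count is an arbitrary choice.

```python
import numpy as np
from itertools import combinations

def best_thresholds(hist, num_thresholds, criterion):
    """Exhaustively search the tuple (t1 < ... < t_{M-1}) maximizing a
    between-class criterion f(hist, thresholds); the criterion is a plug-in."""
    levels = len(hist)
    best, best_score = None, -np.inf
    for ts in combinations(range(1, levels), num_thresholds):
        score = criterion(hist, ts)
        if score > best_score:
            best, best_score = ts, score
    return best, best_score

def between_class_variance(hist, thresholds):
    """Otsu-style between-class variance, used here only as an example criterion."""
    p = hist / hist.sum()
    groups = np.split(np.arange(len(hist)), list(thresholds))
    mu_total = (p * np.arange(len(hist))).sum()
    score = 0.0
    for idx in groups:
        w = p[idx].sum()
        if w > 0:
            mu = (p[idx] * idx).sum() / w
            score += w * (mu - mu_total) ** 2
    return score

hist = np.bincount(np.random.randint(0, 64, 5000), minlength=64)   # toy 64-level histogram
print(best_thresholds(hist, num_thresholds=2, criterion=between_class_variance))
```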
... Segmentation Based on Nucleus. Fatemeh Zamani, Reza Safabakhsh ... Applications of Computer Vision, pp. 76-81, 1998. [3] S. F. Bikhet, A. M. Darwish, H. A. Tolba and S. I. Shaheen, "Segmentation and Clustering of WBC", IEEE Conf. ...
Generating textual descriptions for images has been an attractive problem for computer vision and natural language processing researchers in recent years. Dozens of models based on deep learning have been proposed to solve this problem. The existing approaches are based on neural encoder-decoder structures equipped with an attention mechanism. These methods train decoders to maximize the log-likelihood of the next word in a sentence given the previous ones, which results in sparsity of the output space. In this work, we propose a new approach that trains decoders to regress the word embedding of the next word given the previous ones instead of maximizing the log-likelihood. The proposed method is able to learn and extract long-term information and can generate longer, fine-grained captions without introducing any external memory cell. Furthermore, decoders trained by the proposed technique can take the importance of the generated words into consideration whi...
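A minimal sketch of such an embedding-regression objective follows: the decoder output vector is pushed toward the embedding of the ground-truth next word, and at generation time the nearest vocabulary embedding is emitted. The cosine distance and nearest-neighbour decoding are assumed choices for illustration, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def embedding_regression_loss(predicted_vec, target_word_ids, embedding_table):
    """Regression-style decoder objective: make the decoder output close to the
    embedding of the ground-truth next word (cosine distance is an illustrative choice)."""
    target_vec = embedding_table[target_word_ids]                 # (batch, dim)
    return (1.0 - F.cosine_similarity(predicted_vec, target_vec, dim=-1)).mean()

def predict_next_word(predicted_vec, embedding_table):
    """At generation time, emit the word whose embedding is nearest (by cosine
    similarity) to the regressed vector."""
    sims = F.cosine_similarity(predicted_vec.unsqueeze(1),        # (batch, 1, dim)
                               embedding_table.unsqueeze(0),      # (1, vocab, dim)
                               dim=-1)
    return sims.argmax(dim=-1)                                    # (batch,)

vocab, dim = 1000, 256
emb = torch.randn(vocab, dim)
decoder_out = torch.randn(4, dim)
targets = torch.randint(0, vocab, (4,))
print(embedding_regression_loss(decoder_out, targets, emb).item())
print(predict_next_word(decoder_out, emb))
```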
In many image analysis systems, segmentation is the first step, so its accuracy affects the efficiency of the whole system. In normal human blood microscopic images, which contain white and red blood cells, the high accumulation of red cells causes the cells to touch and overlap. Touching and overlapping are two difficult issues in image segmentation which common segmentation algorithms cannot overcome
The neural encoder-decoder framework has significantly advanced the state of the art in machine translation. In recent years, many researchers have employed encoder-decoder based models to solve sophisticated tasks such as image/video captioning, textual/visual question answering, and text summarization. In this work, we study the baseline encoder-decoder framework in machine translation and take a brief look at the encoder structures proposed to cope with the difficulties of feature extraction. Furthermore, we provide an empirical study of solutions that enable decoders to generate richer, fine-grained output sentences. Finally, we study the attention mechanism, a technique for coping with long-term dependencies and improving encoder-decoder performance on sophisticated tasks.
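For readers unfamiliar with the mechanism studied here, the following is a textbook-style sketch of dot-product attention over encoder states; it is a generic illustration, not a specific model from the survey.

```python
import numpy as np

def dot_product_attention(decoder_state, encoder_states):
    """Generic dot-product attention: score each encoder state against the current
    decoder state, normalize with softmax, and return the weighted context vector."""
    scores = encoder_states @ decoder_state            # (src_len,)
    scores = scores - scores.max()                     # numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()    # softmax over source positions
    context = weights @ encoder_states                 # (hidden,)
    return context, weights

enc = np.random.randn(7, 32)      # 7 source positions, hidden size 32
dec = np.random.randn(32)
context, attn = dot_product_attention(dec, enc)
print(attn.round(2), context.shape)
```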
In our previous work, we proposed DNPSO (Dynamic Niching Particle Swarm Optimizer) and evaluated its performance on static multi-modal functions. DNPSO is a multi-swarm technique in which the main swarm is divided into several sub-swarms and a number of free particles. Thanks to the dynamic creation and deletion of sub-swarms, DNPSO performs well on both static multi-modal and dynamic optimization problems. In this paper, the performance of DNPSO is investigated using two different dynamic function generators. Experimental results for three different dynamic environments show that DNPSO is well suited to these types of problems and clearly outperforms standard PSO.
In this paper, a new variant of the PSO algorithm called the dynamic niching particle swarm optimizer (DNPSO) is proposed. Like basic PSO, DNPSO is a global optimization algorithm, but its main population of particles is divided into several sub-swarms and a group of free particles. A new sub-swarm forming algorithm is proposed. This new form of sub-swarm creation, combined with free particles that follow a cognition-only model of PSO, achieves a good balance between the exploration and exploitation characteristics of standard PSO. DNPSO is tested on well-known and widely used benchmark functions, and the results are compared with several PSO-based multi-modal optimization methods. The results show that in all cases DNPSO provides the best solutions.
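The sketch below contrasts the standard PSO velocity update with the cognition-only update used for free particles, which the abstract mentions; the sub-swarm creation and deletion logic of DNPSO is deliberately omitted, and the parameter values are conventional PSO defaults, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def pso_velocity(v, x, pbest, gbest, w=0.72, c1=1.49, c2=1.49):
    """Standard PSO velocity update (inertia + cognitive + social terms)."""
    r1, r2 = rng.random(x.shape), rng.random(x.shape)
    return w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)

def cognition_only_velocity(v, x, pbest, w=0.72, c1=1.49):
    """Cognition-only model used for 'free' particles: no social term, so each
    free particle keeps exploring around its own best position."""
    r1 = rng.random(x.shape)
    return w * v + c1 * r1 * (pbest - x)

# One update step for a sub-swarm particle vs. a free particle (illustrative only).
x = rng.uniform(-5, 5, size=2); v = np.zeros(2)
pbest, gbest = x.copy(), rng.uniform(-5, 5, size=2)
print(pso_velocity(v, x, pbest, gbest))
print(cognition_only_velocity(v, x, pbest))
```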
    Automated visual speech analysis and synthesis is becoming a widespread application. The most important stage in these systems is to design an algorithm for lip finding and tracking. These systems normally find the lip area in the first captured image and then try to initialize a ...
Abstract: There are many wavelet-based approaches in the digital image watermarking literature. In these approaches, the main issues are how to select the wavelet sub-band coefficients and the embedding algorithm. Here, an entropy-based method is proposed for non-blind watermarking of still gray-level images using the discrete wavelet transform (DWT). Our approach also exploits DWT features and human visual system (HVS) characteristics in the embedding phase of the watermarking algorithm. In comparison with well-known DWT-based methods, and against the attacks reported in the literature, it shows better performance. With simple modifications, the method can be used for color images and in real-time systems.
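A minimal sketch of entropy-guided coefficient selection for DWT watermarking follows, using PyWavelets for the transform. The choice of the horizontal-detail sub-band, the block-entropy measure, and the additive embedding strength are illustrative assumptions, not the scheme from the paper.

```python
import numpy as np
import pywt

def embed_watermark(image, watermark_bits, alpha=8.0):
    """Toy non-blind DWT watermarking sketch: embed bits additively into the
    horizontal-detail coefficients whose 2x2 blocks have the largest entropy."""
    cA, (cH, cV, cD) = pywt.dwt2(image.astype(float), 'haar')

    # rank coefficient blocks by the entropy of their magnitudes
    h, w = cH.shape
    blocks = cH[:h - h % 2, :w - w % 2].reshape(h // 2, 2, w // 2, 2).swapaxes(1, 2)

    def block_entropy(b):
        p = np.abs(b).ravel() + 1e-9
        p = p / p.sum()
        return -(p * np.log2(p)).sum()

    scores = np.array([[block_entropy(blocks[i, j]) for j in range(blocks.shape[1])]
                       for i in range(blocks.shape[0])])
    flat_order = np.argsort(scores, axis=None)[::-1]          # highest-entropy blocks first

    for bit, idx in zip(watermark_bits, flat_order):
        bi, bj = divmod(int(idx), scores.shape[1])
        cH[2 * bi, 2 * bj] += alpha if bit else -alpha        # additive +/- alpha embedding

    return pywt.idwt2((cA, (cH, cV, cD)), 'haar')

img = np.random.randint(0, 256, (64, 64))
marked = embed_watermark(img, watermark_bits=[1, 0, 1, 1, 0])
print(marked.shape)
```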
ABSTRACT Active contour models are widely used for extracting object boundaries. However, most of these methods fail to capture concave boundaries properly and impose a high computational cost. In this paper, a new SOM-based active contour model that introduces conscience and archiving mechanisms (CASOM) is proposed to extend the batch SOM (BSOM) method and eliminate its deficiencies. The performance of the proposed method is evaluated in experiments on a set of grayscale images. Experimental results are compared with those of BSOM in terms of accuracy and convergence speed. The results reveal that, compared to BSOM, the proposed method requires fewer computations to converge to object boundaries and extracts the boundaries of complex objects more accurately, even in the presence of weak or broken edges.
In this paper, transliteration is carried out using an attention-based deep learning approach. Unlike previous works, which randomly initialize the encoder weights, the word-vector representation of the source vocabulary is used as the initial value of the weights. The representation is computed by counting co-occurrences between different characters. Experimental results on an English-to-Persian transliteration corpus with more than 14,000 word pairs show the superior performance of the proposed method (up to 4.21 BLEU points of improvement) over the basic attention-based approach.
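The following sketch shows one way such character co-occurrence representations can be built and used to initialize an encoder's character embedding table. The window size and the raw-count, row-normalized representation are illustrative choices, not necessarily those of the paper.

```python
import numpy as np

def char_cooccurrence_embeddings(words, window=2):
    """Build character vectors from co-occurrence counts within each word;
    such vectors can initialize the encoder's character embedding weights."""
    chars = sorted({c for w in words for c in w})
    index = {c: i for i, c in enumerate(chars)}
    counts = np.zeros((len(chars), len(chars)))
    for w in words:
        for i, c in enumerate(w):
            for j in range(max(0, i - window), min(len(w), i + window + 1)):
                if j != i:
                    counts[index[c], index[w[j]]] += 1
    # row-normalize so every character gets a distribution-like vector
    counts /= np.maximum(counts.sum(axis=1, keepdims=True), 1)
    return index, counts

index, emb = char_cooccurrence_embeddings(["salam", "hello", "sara", "lale"])
print(emb[index['a']].round(2))
```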
The LSB method is one of the best-known steganography methods; it hides the message bits in the least significant bits of pixel values. This method changes the statistical properties of the image, which makes the channel insecure. To increase its security against steganalysis methods, this paper proposes an adaptive method for hiding data in images, so that the amount of data and the method used for hiding it differ from one image area to another. Experimental results show that the security of the proposed method is higher than that of the general LSB method, and in some cases the capacity of the carrier signal is increased.
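To make the idea of area-dependent embedding concrete, here is a toy sketch where "busy" pixels (high local variance) carry two message bits and smooth pixels carry one. The variance rule is an illustrative stand-in for an adaptive hiding strategy, not the paper's actual criterion.

```python
import numpy as np

def adaptive_lsb_embed(image, bits, var_threshold=100.0):
    """Toy adaptive LSB embedding: high-variance pixels carry 2 bits, smooth pixels 1."""
    img = image.astype(np.uint8)
    flat = img.ravel()
    # local "busyness": variance of each pixel's 3x3 neighbourhood
    pad = np.pad(image.astype(float), 1, mode='edge')
    local_var = np.var([pad[i:i + image.shape[0], j:j + image.shape[1]]
                        for i in range(3) for j in range(3)], axis=0).ravel()
    k = 0
    for p in range(flat.size):
        if k >= len(bits):
            break
        n = min(2 if local_var[p] > var_threshold else 1, len(bits) - k)
        value = int(''.join(map(str, bits[k:k + n])), 2)
        flat[p] = (int(flat[p]) >> n << n) | value     # overwrite the n least significant bits
        k += n
    return img

cover = np.random.randint(0, 256, (8, 8))
stego = adaptive_lsb_embed(cover, bits=[1, 0, 1, 1, 0, 0, 1, 0])
print(np.count_nonzero(stego != cover.astype(np.uint8)), "pixels changed")
```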
Abstract In this paper, a new estimation of distribution algorithm is introduced. The goal is to propose a method that avoids the complex approximations involved in learning a probabilistic graphical model while still capturing multivariate dependencies between continuous random variables. A parallel model of several subgraphs, each with a smaller number of variables, is learned as the probabilistic graphical model. In each generation, the joint probability distribution of the selected solutions is estimated using a Gaussian mixture model. Then, learning the graphical model of dependencies among random variables and sampling are done separately for each Gaussian component. In the learning step, the structure of a Markov network is learned from the selected solutions of each Gaussian mixture component. This network is decomposed into maximal cliques and a clique graph. Then, complete Bayesian network structures are learned for these subgraphs using an optimization algorithm. The resulting optimization problem is a 0-1 constrained quadratic program that finds the best permutation of variables. Finally, sampling is performed from the Bayesian network of each Gaussian component. The introduced method is compared with other network-based estimation of distribution algorithms on the optimization of continuous numerical functions.
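As orientation only, the skeleton below shows the outer loop of a Gaussian-mixture EDA: select the best solutions, fit a mixture, and sample the next population. The per-component Markov-network and Bayesian-network structure learning described in the abstract is deliberately omitted, and all parameters are illustrative.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def gmm_eda(objective, dim=10, pop_size=200, n_components=3, generations=50, seed=0):
    """Skeleton of a Gaussian-mixture EDA (minimization); the graphical-model
    learning step from the paper is not reproduced here."""
    rng = np.random.default_rng(seed)
    pop = rng.uniform(-5, 5, size=(pop_size, dim))
    for _ in range(generations):
        fitness = np.apply_along_axis(objective, 1, pop)
        selected = pop[np.argsort(fitness)[:pop_size // 2]]          # truncation selection
        gm = GaussianMixture(n_components=n_components,
                             covariance_type='full').fit(selected)
        pop, _ = gm.sample(pop_size)                                 # sample new population
    fitness = np.apply_along_axis(objective, 1, pop)
    return pop[np.argmin(fitness)], fitness.min()

best_x, best_f = gmm_eda(lambda x: np.sum(x ** 2))                   # sphere function
print(best_f)
```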
    This paper investigates the use of time-adaptive self-organizing map (TASOM)-based active contour models (ACMs) for detecting the boundaries of the human eye sclera and tracking its movements in a sequence of images. The task begins with extracting the head boundary ...
This study introduces an effective solution for text-independent writer identification by generalising the contour-hinge feature into what is called the n-tuple direction feature. To extract the n-tuple direction feature, the authors first obtain all contours of the connected components; then n + 1 points a certain distance apart are considered on the contour, and the directions of the fragments connecting successive points are computed. The n + 1 points move along the contour and an n-dimensional histogram of directions is accumulated. The proposed method is evaluated on large Farsi and English databases. Correct writer identification rates of 92.2% for English handwriting from 900 persons and 97.7% for Farsi handwriting from 600 persons are achieved. Comparison with other studies shows the promising performance and superiority of the proposed method.
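A minimal sketch of this feature extraction follows: slide n + 1 equally spaced points along a closed contour, quantize the direction of each of the n connecting fragments, and accumulate an n-dimensional histogram. The step size and number of direction bins are illustrative parameters, not values from the paper.

```python
import numpy as np

def n_tuple_direction_histogram(contour, n=2, step=5, num_bins=12):
    """Sketch of an n-tuple direction feature over an ordered, closed contour."""
    contour = np.asarray(contour, dtype=float)          # (num_points, 2), ordered
    L = len(contour)
    hist = np.zeros((num_bins,) * n)
    for start in range(L):
        pts = contour[[(start + k * step) % L for k in range(n + 1)]]
        deltas = np.diff(pts, axis=0)                           # n fragments
        angles = np.arctan2(deltas[:, 1], deltas[:, 0])         # fragment directions
        bins = ((angles + np.pi) / (2 * np.pi) * num_bins).astype(int) % num_bins
        hist[tuple(bins)] += 1
    return (hist / hist.sum()).ravel()                   # normalized feature vector

# Tiny example: a circular contour sampled at 40 points.
t = np.linspace(0, 2 * np.pi, 40, endpoint=False)
circle = np.c_[np.cos(t), np.sin(t)]
print(n_tuple_direction_histogram(circle, n=2).shape)    # (12*12,) histogram
```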
ABSTRACT In this paper, we propose a new traffic surveillance system able to perform surveillance tasks in real time. The proposed classification method classifies objects into vehicles and non-vehicles (pedestrians and motorcycles). In addition, the system can efficiently detect the vehicle type as large or small without relying on size-based features. Our tracking algorithm uses a region-based tracker to explicitly define occlusion relationships between vehicles. For occlusion handling, we use a Kalman filter to estimate the positions of moving vehicles and a tree structure in which moving regions are arranged. In this way, we obtain robust motion estimates and trajectories for vehicles, even in the presence of occlusions. We demonstrate the efficient performance of the proposed system in experiments on real-world traffic scenes.
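For the position-estimation step, the following is a minimal constant-velocity Kalman filter for a vehicle centroid, of the kind commonly used to predict positions while a vehicle is occluded; the noise levels are illustrative, not tuned values from the paper.

```python
import numpy as np

class ConstantVelocityKF:
    """Minimal constant-velocity Kalman filter for a vehicle centroid (x, y)."""

    def __init__(self, dt=1.0, q=1.0, r=4.0):
        self.F = np.array([[1, 0, dt, 0], [0, 1, 0, dt],
                           [0, 0, 1, 0], [0, 0, 0, 1]], dtype=float)   # state transition
        self.H = np.array([[1, 0, 0, 0], [0, 1, 0, 0]], dtype=float)   # observe position only
        self.Q = q * np.eye(4)          # process noise
        self.R = r * np.eye(2)          # measurement noise
        self.x = np.zeros(4)            # [x, y, vx, vy]
        self.P = 100.0 * np.eye(4)

    def predict(self):
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.x[:2]               # predicted position (useful while occluded)

    def update(self, z):
        y = np.asarray(z, float) - self.H @ self.x
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)
        self.x = self.x + K @ y
        self.P = (np.eye(4) - K @ self.H) @ self.P

kf = ConstantVelocityKF()
for z in [(10, 5), (12, 6), (14, 7)]:   # measured centroids before an occlusion
    kf.predict(); kf.update(z)
print(kf.predict().round(1))            # position estimate during the occlusion
```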
ABSTRACT Segmentation of moving objects in an image sequence is one of the most fundamental and crucial steps in visual surveillance applications. This paper proposes a novel and efficient method for detecting moving objects against a noisy background by using a growing self-organizing map to construct the codebook. The segmentation process distinguishes between the parts of objects that move over static and dynamic background regions, such as roads and waving trees, respectively. The advantage of the proposed method is that it creates a small codebook based on the input pattern to model the background, which reduces computational complexity and increases segmentation speed. We compare the proposed method with three other background subtraction algorithms and show that it achieves higher precision and detection rates.
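To show how a per-pixel codebook is used at segmentation time, here is a plain codebook background test: a pixel is background if its value is close to any codeword for that pixel. This illustrates only the matching step; the growing self-organizing map that builds the codebook in the paper is not reproduced, and the threshold is an assumption.

```python
import numpy as np

def segment_foreground(frame, codebooks, threshold=20.0):
    """Per-pixel codebook background test; True in the mask means foreground."""
    h, w = frame.shape
    mask = np.ones((h, w), dtype=bool)
    for i in range(h):
        for j in range(w):
            dists = np.abs(codebooks[i][j] - frame[i, j])
            if dists.size and dists.min() < threshold:
                mask[i, j] = False                      # matched a background codeword
    return mask

# Tiny example: each pixel's codebook holds a few representative background intensities.
h, w = 4, 4
codebooks = [[np.array([50.0, 55.0, 60.0]) for _ in range(w)] for _ in range(h)]
frame = np.full((h, w), 52.0); frame[1, 2] = 200.0      # one bright moving-object pixel
print(segment_foreground(frame, codebooks).astype(int))
```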
Standard databases allow different researchers to evaluate and compare various pattern recognition techniques; thus, they are essential for the advancement of research. There are handwriting databases in various languages, but no large standard database of handwritten text exists for evaluating writer identification and verification algorithms in Farsi. This paper introduces a large handwritten Farsi text database called HaFT. The database contains 1800 gray-scale images of unconstrained text written by 600 writers. Each participant provided three separate eight-line samples of his or her handwriting, each written at a different time on a separate sheet. HaFT is presented in several versions, each including different lengths of text and using identical or different writing instruments. A new measure, called CVM, is defined which effectively reflects the size of the handwriting and thus the content volume of a given text image. The database is designed for training and testing Farsi writer identification and verification using handwritten text. In addition, it can also be used for training and testing handwritten Farsi text segmentation and recognition algorithms. HaFT is available for research use.
    In this paper, an improved active contour model based on the time-adaptive self-organizing map with a high convergence speed and low computational complexity is proposed. For this purpose, the active contour model based on the original time-adaptive self-organizing ...
... For example, the error backpropagation method for SNNs [2], the Kohonen self-organizing layer [19] and other self-organizing networks with spiking neurons [3], spiking Hopfield networks [12], associative memories with spiking neurons [20], the RBF networks of spiking neurons [15], [8 ...
ABSTRACT Video inpainting methods have a large number of applications, and some of these algorithms are specialized for specific tasks such as logo removal. There are only a few general video inpainting algorithms, most of which are very time-consuming, which makes them unsuitable for fast video inpainting. In this paper, a fast and simple logo removal algorithm is proposed which uses the frames of each video shot for logo removal and removes the logo from the video after a few iterations. A more accurate non-causal version of the algorithm is also proposed which uses information from both previous and next frames. The quality of the inpainted video is comparable with that of well-known video inpainting algorithms.
We have developed a model that represents the differential operation of block ciphers in order to help find differential characteristics. Through this model, the whole space of differential characteristics of a block cipher is represented by a multi-level weighted directed graph. In this way, the problem of finding the best differential characteristic for a block cipher reduces to the problem
In this paper, we present a fast, simple and very powerful method for identifying human beings based on features of their iris texture. In contrast to current approaches, which use complex mathematical descriptions of the iris texture for feature extraction, a very simple approach is presented for extracting texture features from the highly random iris texture. The proposed method is
