
Jean-Christophe Burie

Identity document recognition is an important sub-field of document analysis, which deals with tasks of robust document detection, type identification, text field recognition, as well as identity fraud prevention and document authenticity validation given photos, scans, or video frames of an identity document capture. A significant amount of research has been published on this topic in recent years; however, a chief difficulty for such research is the scarcity of datasets, the subject matter being protected by security requirements. The few datasets of identity documents that are available lack diversity of document types, capturing conditions, or variability of document field values. In this paper, we present MIDV-2020, a dataset which consists of 1000 video clips, 2000 scanned images, and 1000 photos of 1000 unique mock identity documents, each with unique text field values and unique artificially generated faces, with rich annotation. The dataset contains 72409 annotated images...
Various government and commercial services, including, but not limited to, e-government, fintech, banking, and sharing-economy services, widely use smartphones to simplify service access and user authorization. Many organizations involved in these areas use identity document analysis systems in order to improve user personal-data-input processes. The tasks of such systems are not only ID document data recognition and extraction, but also fraud prevention by detecting document forgery or by checking whether the document is genuine. Modern systems of this kind are often expected to operate in unconstrained environments. A significant amount of research has been published on the topic of mobile ID document analysis, but the main difficulty for such research is the lack of public datasets, due to the fact that the subject is protected by security requirements. In this paper, we present the DLC-2021 dataset, which consists of 1424 video clips captured in a wide range of real-world conditions...
Motivated by the increasing possibility of genuine documents being tampered with during transmission over digital channels, we focus on developing a watermarking framework for determining whether a given document is genuine or falsified. The proposed framework hides a security feature or secret information within the document. In order to hide the security feature, we replace appropriate characters of the legal document with equivalent characters coming from generated fonts, hereafter called variations of characters. These variations are produced by training generative adversarial networks (GAN) on features of the characters' skeletons and normal shapes. To detect the hidden information, we make use of fully convolutional networks (FCN) to produce salient regions from the watermarked document. The salient regions mark the positions in the document where characters were substituted by their variations, and these positions are used as a reference for extracting the hidden information. Lastly, we demonstrate that our approach gives high precision of data detection and competitive performance compared to state-of-the-art approaches.
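The substitution idea above can be sketched as a toy model: each eligible character position renders either the normal glyph or its generated variation, and choosing between the two encodes one bit. Glyph rendering, the GAN, and the FCN-based salient-region detector are out of scope here, so variations are simulated with plain character substitutions; all names and values below are illustrative, not the paper's implementation.

```python
# Toy sketch of character-substitution watermarking. A "variation" of a
# character is simulated here by a lookup table; in the paper it would be
# a GAN-generated glyph that looks nearly identical to the original.

def embed_bits(text, positions, bits, variant):
    """Replace the character at each eligible position with its variation
    when the corresponding secret bit is 1."""
    chars = list(text)
    for pos, bit in zip(positions, bits):
        if bit == 1:
            chars[pos] = variant[chars[pos]]
    return "".join(chars)

def extract_bits(watermarked, original, positions):
    """Recover the bits by checking, at each reference position, whether
    the character differs from the original (i.e. a variation was used)."""
    return [0 if watermarked[p] == original[p] else 1 for p in positions]

text = "legal document"
positions = [0, 2, 6, 9]                     # slots with a trained variation
variant = {c: c.upper() for c in set(text)}  # stand-in for GAN variations
secret = [1, 0, 1, 1]

marked = embed_bits(text, positions, secret, variant)
assert extract_bits(marked, text, positions) == secret
```

In the real system the detector does not have the original document; the FCN's salient regions play the role of the reference positions used here.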
ICDAR2015 competition on smartphone document capture and OCR (SmartDoc). Challenge 2: Mobile OCR Competition. The goal of the competition is to extract the textual content from document images captured by mobile phones. The images are taken under varying conditions to provide a challenging input. The dataset was prepared for the ICDAR2015-SmartDoc competition. For more details about the dataset, please visit the competition's website: https://sites.google.com/site/icdar15smartdoc/home or http://smartdoc.univ-lr.fr. You may also refer to the following paper for more details on the ICDAR2015-SmartDoc competition: Jean-Christophe Burie, Joseph Chazalon, Mickaël Coustaty, Sébastien Eskenazi, Muhammad Muzzamil Luqman, Maroua Mehri, Nibal Nayef, Jean-Marc Ogier, Sophea Prum and Marçal Rusinol: "ICDAR2015 Competition on Smartphone Document Capture and OCR (SmartDoc)", in 13th International Conference on Document Analysis and Recognition (ICDAR), 2015. If you use this dataset, please send us a short email at icdar.smartdoc (at) gmail.com to tell us why it was useful to you, and whether you have results or publications we can reference on our website. Thank you!
Data hiding is an effective technique, compared to pervasive black-and-white code patterns such as barcodes and quick response codes, for securing document images against forgery or unauthorized intervention. In this work, we propose a robust digital watermarking scheme for securing genuine documents by leveraging generative adversarial networks (GAN). To begin with, the input document is adjusted to its right form by geometric correction. Next, a generated document is obtained from the input document using these networks; it is regarded as a reference for data hiding and detection. We then introduce an algorithm that hides secret information in the document and produces a watermarked document whose content is minimally distorted under normal observation. Furthermore, we present a method that detects the hidden data from the watermarked document by measuring the distance between the pixel values of the generated and watermarked documents. To improve the security feature, we encode the secret information prior to hiding it using pseudo-random numbers. Lastly, we demonstrate that our approach gives high precision of data detection and competitive performance compared to state-of-the-art approaches.
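The detection step described above can be sketched in miniature: a clean copy of the page stands in for the GAN-generated reference, hiding is simulated by brightening one block per bit, and the detector recovers the bits by measuring the block-wise pixel distance between the reference and the watermarked page. Block size, perturbation strength and threshold are illustrative choices, not values from the paper.

```python
import numpy as np

# Toy model of reference-based watermark detection: one 8x8 block per bit,
# perturbed when the bit is 1, recovered by pixel-distance to the reference.
BLOCK, DELTA, THRESH = 8, 12.0, 4.0

def embed(reference, bits):
    page = reference.astype(float).copy()
    for i, bit in enumerate(bits):
        if bit:
            r = i * BLOCK
            page[r:r + BLOCK, :BLOCK] += DELTA   # perturb block i
    return page

def detect(watermarked, reference, n_bits):
    bits = []
    for i in range(n_bits):
        r = i * BLOCK
        dist = np.abs(watermarked[r:r + BLOCK, :BLOCK]
                      - reference[r:r + BLOCK, :BLOCK]).mean()
        bits.append(1 if dist > THRESH else 0)
    return bits

reference = np.full((64, 64), 200.0)   # stand-in for the generated document
secret = [1, 0, 1, 1, 0, 0, 1, 0]
assert detect(embed(reference, secret), reference, len(secret)) == secret
```

The paper's scheme additionally scrambles the bits with pseudo-random numbers before embedding, so an attacker who finds the perturbed regions still cannot read the message.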
Ancient printed documents are an infinite source of knowledge, but digital uses are usually complicated by the age and quality of the print. The Linguistic Atlas of France (ALF) maps are composed of printed phonetic words used to record how words were pronounced across the country. Those words were printed using the Rousselot-Gillieron alphabet (an extension of the Latin alphabet), which brings character recognition problems due to the large number of diacritics. In this paper, we propose a phonetic character recognition process based on a space-filling-curves approach. We propose an original method adapted to this particular dataset, able to finely classify noisy and specific characters with more than 70% accuracy.
This paper presents an algorithm for parametric supervised colour texture segmentation using a novel image observation model. The proposed segmentation algorithm consists of two phases. In the first phase, we estimate an initial class label field of the image based on a 2D multichannel complex linear prediction model. Information from both luminance and chrominance spatial variation feature cues is used to characterize colour textures. A complex multichannel version of the 2D Quarter Plane Autoregressive model is used to model these spatial variations of colour texture images in the CIE L*a*b* colour space. The overall colour distribution of the image is estimated from the multichannel prediction error sequence of this autoregressive model. Another significant contribution of this paper is the modelling of this multichannel error sequence using a Multivariate Gaussian Mixture Model instead of a single Gaussian probability. Gaussian parameters are calculated through Expectation Maximization on a training...
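The prediction-error step can be sketched on a single real channel: a 2D quarter-plane autoregressive model predicts each pixel from its three causal neighbours (left, up, up-left), with coefficients fitted by least squares. The paper's multichannel complex version in CIE L*a*b* and the Gaussian-mixture modelling of the error are omitted; this only shows why textured or noisy regions produce larger prediction errors than smooth ones.

```python
import numpy as np

# Quarter-plane autoregressive prediction error on one channel.
def qp_prediction_error(img):
    img = img.astype(float)
    left = img[1:, :-1].ravel()      # neighbour (i, j-1)
    up = img[:-1, 1:].ravel()        # neighbour (i-1, j)
    upleft = img[:-1, :-1].ravel()   # neighbour (i-1, j-1)
    target = img[1:, 1:].ravel()
    X = np.stack([left, up, upleft], axis=1)
    coeffs, *_ = np.linalg.lstsq(X, target, rcond=None)
    return target - X @ coeffs       # prediction-error sequence

rng = np.random.default_rng(0)
# A smooth gradient is highly predictable: near-zero error.
smooth = np.add.outer(np.arange(16.0), np.arange(16.0))
noise = rng.normal(size=(16, 16))
assert np.abs(qp_prediction_error(smooth)).mean() < \
       np.abs(qp_prediction_error(noise)).mean()
```

It is the distribution of this error sequence (per texture class) that the paper then models with a multivariate Gaussian mixture instead of a single Gaussian.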
Benefiting from the joint learning of multiple tasks in deep multi-task networks, many applications have shown promising performance compared to single-task learning. However, the performance of a multi-task learning framework is highly dependent on the relative weights of the tasks, and how to assign the weight of each task is a critical issue in multi-task learning. Instead of tuning the weights manually, which is exhausting and time-consuming, in this paper we propose an approach that can dynamically adapt the weights of the tasks according to the difficulty of training each task. Specifically, the proposed method does not introduce hyperparameters, and its simple structure allows other multi-task deep learning networks to easily realize or reproduce this method. We demonstrate our approach for face recognition with facial expression and for facial expression recognition from a single input image, based on deep multi-task learning Convolutional Neural Networks (CNNs...
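One possible form of such dynamic weighting can be sketched as follows: a task's weight grows with its current difficulty, measured here by its loss relative to its initial loss, so the hard task is not drowned out by the easy one. A softmax over the relative losses keeps the weights positive and summing to the number of tasks. This is an illustrative rule, not necessarily the paper's exact hyperparameter-free formulation.

```python
import math

def task_weights(losses, initial_losses):
    """Weight each task by how little it has improved so far."""
    ratios = [l / l0 for l, l0 in zip(losses, initial_losses)]
    exps = [math.exp(r) for r in ratios]
    total = sum(exps)
    # Scale so the weights sum to the number of tasks.
    return [len(losses) * e / total for e in exps]

# Task 0 has barely improved (hard), task 1 has converged (easy).
w = task_weights(losses=[0.9, 0.1], initial_losses=[1.0, 1.0])
assert w[0] > w[1]                  # the hard task gets the larger weight
assert abs(sum(w) - len(w)) < 1e-9
```

Recomputing these weights every few training steps steers gradient budget toward whichever task is currently lagging, which is exactly the failure mode fixed-weight multi-task training suffers from.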
The effectiveness of state-of-the-art face verification/recognition algorithms and the convenience of face recognition greatly boost face-related biometric authentication applications. However, existing face verification architectures seldom integrate any liveness detection, or keep such a stage isolated from face verification as if it were irrelevant. This may potentially expose the system to spoof attacks between the two stages. This work introduces FaceLiveNet, a holistic end-to-end deep network which can perform face verification and liveness detection simultaneously. An interactive scheme for facial expression recognition is proposed to perform liveness detection, providing better generalization capacity and a higher security level. The proposed framework is low-cost, as it relies on commodity hardware instead of costly sensors, and lightweight, with far fewer parameters compared to other popular deep networks such as VGG16 and FaceNet. Experimental results on the benchmarks LFW, YTF, CK+, OuluCASIA, SFEW and FER2013 demonstrate that the proposed FaceLiveNet can achieve state-of-the-art performance or better for both face verification and facial expression recognition. We also introduce a new protocol to evaluate the global performance of face authentication with the fusion of face verification and interactive facial-expression-based liveness detection.
We address in this paper the issue of stitching non-overlapping pairs of pages from real digitized comic books. The main objective is to be able to decide whether two pages are meant to be displayed together for their content to be understood. First, the relevant content is separated from the background, so the pairing is done over relevant pieces of visual information. We define the different kinds of noise that one can find on the edges of such documents and how to remove them. Then we propose a method to decide whether a couple of pages should be paired or not, based on the analysis of the relevant content. Our compatibility methods are evaluated against methods from the literature, adapted from jigsaw puzzle solvers. Results are discussed over an actual commercial dataset of digitized comic books.
Reading the text embedded in natural scene images is essential to many applications. In this paper, we propose a method for detecting text in scene images based on multi-level connected component (CC) analysis and learning text component features via convolutional neural networks (CNN), followed by a graph-based grouping of overlapping text boxes. The multi-level CC analysis allows the extraction of redundant text and non-text components at multiple binarization levels to minimize the loss of any potential text candidates. The features of the resulting raw text/non-text components of different granularity levels are learned via a CNN. Those two modules eliminate the need for complex ad-hoc preprocessing steps for finding initial candidates, and the need for hand-designed features to classify such candidates into text or non-text. The components classified as text at different granularity levels are grouped in a graph based on the overlap of their extended bounding boxes; then, the connected graph components are retained. This eliminates redundant text components and forms words or text lines. When evaluated on the "Robust Reading Competition" dataset for natural scene images, our method achieved better detection results compared to state-of-the-art methods. In addition to its efficacy, our method can be easily adapted to detect multi-oriented or multi-lingual text, as it operates on low-level initial components and does not require such components to be characters.
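The final grouping step can be sketched with a small union-find: component boxes classified as text at different binarization levels are linked whenever their slightly extended bounding boxes overlap, and each connected group is merged into one word or text-line box. Boxes are `(x1, y1, x2, y2)`; the extension margin is an illustrative parameter.

```python
# Graph-based grouping of overlapping text boxes via union-find.

def overlaps(a, b, margin=2):
    """True if the margin-extended boxes a and b intersect."""
    return not (a[2] + margin < b[0] or b[2] + margin < a[0] or
                a[3] + margin < b[1] or b[3] + margin < a[1])

def group_boxes(boxes, margin=2):
    parent = list(range(len(boxes)))

    def find(i):                          # union-find with path halving
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(boxes)):
        for j in range(i + 1, len(boxes)):
            if overlaps(boxes[i], boxes[j], margin):
                parent[find(i)] = find(j)

    groups = {}
    for i, box in enumerate(boxes):
        groups.setdefault(find(i), []).append(box)
    # Merge each connected group into one enclosing box.
    return [(min(b[0] for b in g), min(b[1] for b in g),
             max(b[2] for b in g), max(b[3] for b in g))
            for g in groups.values()]

# Three character boxes on one line plus one isolated box elsewhere.
boxes = [(0, 0, 10, 10), (9, 0, 20, 10), (19, 0, 30, 10), (100, 100, 110, 110)]
merged = group_boxes(boxes)
assert len(merged) == 2
assert (0, 0, 30, 10) in merged
```

The transitivity of the connected components is what lets redundant detections of the same character at different binarization levels collapse into a single word box.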
In order to preserve Sundanese palm leaf manuscripts, several digitization campaigns have been conducted recently. For further access in education and research, a handwritten Sundanese palm leaf manuscript dataset, called the Lontar Sunda dataset, has been created. The dataset was constructed from 66 pages of 27 collections of Sundanese palm leaf manuscripts from the 15th century, digitized from manuscripts from Garut, West Java, Indonesia. This paper presents the Sundanese dataset, which is publicly available for scientific use. The ground truth includes binarized images, annotations at word level and annotations at character level. The Sundanese dataset provides useful data to test word spotting, character/symbol recognition and binarization methods, and will facilitate the evaluation of developed methods.
Among comic elements such as panels, balloons, comic characters and texts, panels play an important role in content adaptation and story animation for small devices such as mobile phones or tablets. Different panel extraction techniques have been investigated over the last ten years; most existing approaches rely on the assumption that a comic panel is either as simple as a rectangle or as complex as a four-edge polygon. In this paper, we re-examine the definition of comic panels: is a four-edge polygon really sufficient to represent the integral information of a comic panel? We suggest a modern definition of comic panels, together with a strong panel extraction baseline method for the approach.
Due to their specific characteristics, palm leaf manuscripts provide new challenges for text line segmentation tasks in document analysis. We investigated the performance of six text line segmentation methods by conducting comparative experimental studies on a collection of palm leaf manuscript images. The image corpus used in this study comes from sample images of palm leaf manuscripts in three different Southeast Asian scripts: Balinese script from Bali and Sundanese script from West Java, both in Indonesia, and Khmer script from Cambodia. For the experiments, four text line segmentation methods that work on binary images were tested: the adaptive partial projection line segmentation approach, the A* path planning approach, the shredding method, and our proposed energy function for the shredding method. Two other methods that can be applied directly to grayscale images were also investigated: the adaptive local connectivity map method and the seam-carving-based method. The evaluation criteria and tool provided by the ICDAR2013 Handwriting Segmentation Contest were used in this experiment.
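The simplest member of the family evaluated above is projection-based segmentation, which the partial-projection, shredding and seam-carving methods all refine. A minimal sketch on a binary image: rows containing ink form peaks in the horizontal projection profile, and text lines are the maximal runs of rows whose ink count exceeds a threshold (the threshold is an illustrative parameter; real palm leaf images need the adaptive variants compared in the paper).

```python
import numpy as np

def segment_lines(binary, min_ink=1):
    """Return (top_row, bottom_row) for each run of inked rows."""
    profile = binary.sum(axis=1)          # ink pixels per row
    lines, start = [], None
    for row, ink in enumerate(profile):
        if ink >= min_ink and start is None:
            start = row                   # a text line begins
        elif ink < min_ink and start is not None:
            lines.append((start, row - 1))
            start = None
    if start is not None:                 # line touching the bottom edge
        lines.append((start, len(profile) - 1))
    return lines

page = np.zeros((12, 20), dtype=int)
page[1:3, 2:18] = 1                       # first text line
page[6:9, 1:19] = 1                       # second text line
assert segment_lines(page) == [(1, 2), (6, 8)]
```

This global profile fails when lines are skewed or touching, which is precisely what motivates the shredding and seam-carving methods the paper compares.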
Document analysis is an active field of research, which aims at a complete understanding of the semantics of a given document. One example of the document understanding process is enabling a computer to identify the key elements of a comic book story and arrange them according to a predefined domain knowledge. In this study, we propose a knowledge-driven system in which bottom-up and top-down information interact to progressively understand the content of a document. We model the knowledge of both the comic book and the image processing domains for information consistency analysis. In addition, different image processing methods are improved or developed to extract panels, balloons, tails, texts, comic characters and their semantic relations in an unsupervised way.
This paper presents a comparison of parametric and non-parametric models of multichannel linear prediction error for supervised color texture segmentation. Information from both luminance and chrominance spatial variation feature cues is used to characterize color textures. The method presented consists of two steps. In the first step, we estimate the linear prediction errors of color textures computed on small training...
Graphs are popular data structures used to model pairwise relations between elements of a given collection. In image processing, adjacency graphs are often used to represent the relations between segmented regions. The comparison of such graphs has been largely studied, but graph matching strategies are essential to find similar patterns efficiently. In this paper, we propose a method to detect recurring characters in comic books; note that here the term "character" refers to the protagonists of the story. In our approach, each panel is represented by an attributed adjacency graph. Then, an inexact graph matching strategy is applied to find recurring structures among this set of graphs. The main idea is that the same character will be represented by similar subgraphs in the different panels where it appears. The two-step matching process consists of a node matching step and an edge validation step. Experiments show that our approach is able to detect recurring structures in the graphs and consequently the recurring characters in a comic book. The originality of our approach is that no prior object model of the characters is required: the algorithm automatically detects all recurring structures corresponding to the main characters of the story.
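The two-step matching can be sketched in miniature: regions (nodes) carry a simple attribute (here a mean colour tuple), nodes of panel B are matched to the closest node of panel A, and an edge-validation step keeps only the matched pairs that are also adjacent in both panels. The attributes, distance and threshold are illustrative stand-ins for the paper's actual region features.

```python
# Toy two-step inexact graph matching: node matching + edge validation.

def dist(u, v):
    return sum((x - y) ** 2 for x, y in zip(u, v)) ** 0.5

def node_matches(attrs_a, attrs_b, max_dist=30):
    """Match each node of B to the nearest node of A, if close enough."""
    matches = {}
    for b, vb in attrs_b.items():
        best = min(attrs_a, key=lambda a: dist(attrs_a[a], vb))
        if dist(attrs_a[best], vb) <= max_dist:
            matches[b] = best
    return matches

def validate_edges(matches, edges_a, edges_b):
    """Keep node matches supported by an edge present in both panels."""
    kept = set()
    for (b1, b2) in edges_b:
        if b1 in matches and b2 in matches:
            if (matches[b1], matches[b2]) in edges_a or \
               (matches[b2], matches[b1]) in edges_a:
                kept.update([b1, b2])
    return kept

# Panel A: a character made of "face" and "hair" regions, plus background.
attrs_a = {"face": (230, 190, 170), "hair": (40, 30, 20), "sky": (120, 180, 250)}
edges_a = {("face", "hair")}
# Panel B: the same character, slightly different colours, different layout.
attrs_b = {"r1": (228, 192, 168), "r2": (42, 28, 22), "r3": (250, 250, 250)}
edges_b = {("r1", "r2"), ("r2", "r3")}

matches = node_matches(attrs_a, attrs_b)
assert validate_edges(matches, edges_a, edges_b) == {"r1", "r2"}
```

The edge-validation step is what makes the matching structural rather than purely appearance-based: a region with character-like colour is discarded unless its neighbourhood also matches.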
Frédéric Petit, Philippe Blasi, Anne-Sophie Capelle-Laizé, Jean-Christophe Burie: "Underwater Images Enhancement by Light Propagation Model Reversion", ICONES/Imedoc collaboration, 2008.
Text detection and recognition in a natural environment are key components of many applications, ranging from business card digitization to shop indexation in a street. This competition aims at assessing the ability of state-of-the-art methods to detect Multi-Lingual Text (MLT) in scene images, such as in contents gathered from Internet media and in modern cities where multiple cultures live and communicate together. This competition is an extension of the Robust Reading Competition (RRC), which has been held since 2003 both in ICDAR and in an online context. The proposed competition is presented as a new challenge of the RRC. The dataset built for this challenge largely extends the previous RRC editions in many aspects: multi-lingual text, the size of the dataset, multi-oriented text, and the wide variety of scenes. The dataset comprises 18,000 images containing text in 9 languages. The challenge comprises three tasks related to text detection and script classification. We received a total of 16 participations from the research and industrial communities. This paper presents the dataset, the tasks and the findings of this RRC-MLT challenge.
Comic books constitute an important heritage in many countries. Nowadays, digitization allows searching directly in the content instead of metadata only (e.g. album title or author name). Few studies have been done in this direction; only frame and speech balloon extraction have been experimented with, in the case of simple page structures. In fact, the page structure depends on the author, which is why many different structures and drawings exist. Despite the differences, drawings have a common characteristic due to the design process: they are all surrounded by a black line. In this paper, we propose to rely on this particularity of comic books to automatically extract frames and text using a connected-component labeling analysis. The approach is compared with some existing methods found in the literature and results are presented.
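The core operation can be sketched as plain 4-connected component labeling on a binary page (1 = black line pixel), from which frame candidates would then be selected by size and shape; the preprocessing and filtering rules a real page needs are omitted.

```python
from collections import deque

def label_components(binary):
    """4-connected component labeling by BFS flood fill.
    Returns (number of components, label matrix)."""
    h, w = len(binary), len(binary[0])
    labels = [[0] * w for _ in range(h)]
    current = 0
    for i in range(h):
        for j in range(w):
            if binary[i][j] == 1 and labels[i][j] == 0:
                current += 1                      # new component found
                queue = deque([(i, j)])
                labels[i][j] = current
                while queue:
                    y, x = queue.popleft()
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w and
                                binary[ny][nx] == 1 and labels[ny][nx] == 0):
                            labels[ny][nx] = current
                            queue.append((ny, nx))
    return current, labels

# Two separate black strokes on a small page.
page = [[0, 1, 1, 0, 0],
        [0, 1, 0, 0, 1],
        [0, 0, 0, 0, 1]]
n, labels = label_components(page)
assert n == 2
```

Because frame borders are closed black lines, each frame's border pixels end up in one component whose bounding box approximates the panel; text strokes form many small components instead, which is how the two classes can be separated.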
Face recognition on realistic visual images has been well studied and has made significant progress in the recent decade. Unlike for realistic visual images, face recognition performance on caricatures is far from that on visual images. This is largely due to the extreme non-rigid distortions of caricatures, introduced by exaggerating the facial features to strengthen the characters. The heterogeneous modalities of caricatures and visual images make caricature-visual face recognition a cross-modal problem. In this paper, we propose a method to conduct caricature-visual face recognition via multi-task learning. Rather than conventional multi-task learning with fixed task weights, this work proposes an approach to learn the weights of the tasks according to their importance. The proposed multi-task learning with dynamic task weights makes it possible to appropriately train the hard task and the easy task instead of being stuck over-training the easy task...
Researchers in digital humanities who are interested in the analysis of large text corpora use many methods and tools from computer science, such as natural language processing (Piotrowski, 2012) or network analysis (Lemercier, 2005). Recent methods based on neural networks are also of major interest. Word2Vec is a method that has greatly facilitated the use of such models (Mikolov, 2013). Its various optimizations make it very simple to train a model on large amounts of data using an ordinary desktop computer. The source code has been widely distributed, which has made this method very popular, notably among researchers in digital humanities. Hamilton, for example, has shown the value of these models for analyzing how certain words of the language evolve over time (Hamilton, 2016). These methods can also be used for other purposes. Indeed, many...
This paper presents a comparison of colour spaces, including IHLS and L*a*b*, for colour texture characterization. Colour information is used to build a two-channel image that contains pure luminance values in one channel and complex chrominance values in the other channel. The power spectrum estimation is done using 2D multichannel linear prediction models. A spectral analysis using luminance and chrominance spectra shows that the IHLS colour space presents a more important interference between luminance and chrominance channels than the L*a*b* colour space. The spectra are used to characterize colour textures, and classification is then carried out in each colour space. The individual as well as the combined effect of information from luminance and chrominance structure cues is used for classification. A better classification rate on the set of colour textures is obtained for the L*a*b* colour space.
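The two-channel representation can be sketched directly: luminance (L*) is kept as a real channel, and the two chrominance components (a*, b*) are packed into a single complex channel a* + i·b*, so that 2D multichannel linear prediction can treat chrominance as one signal. The L*a*b* values below are made up for illustration; real use would first convert from RGB.

```python
import numpy as np

def two_channel(L, a, b):
    """Build the (real luminance, complex chrominance) channel pair."""
    return L.astype(float), a.astype(float) + 1j * b.astype(float)

L = np.array([[50.0, 60.0], [70.0, 80.0]])
a = np.array([[10.0, -5.0], [0.0, 3.0]])
b = np.array([[20.0, 8.0], [-4.0, 1.0]])

lum, chrom = two_channel(L, a, b)
assert chrom[0, 0] == 10.0 + 20.0j
# The chrominance magnitude combines a* and b* energy in one value.
assert np.isclose(abs(chrom[0, 0]), (10.0**2 + 20.0**2) ** 0.5)
```

Packing a* and b* into one complex plane is what allows a single 2D linear prediction model per channel pair, instead of three independent real channels, and it is the cross-spectra between these two channels that reveal the luminance-chrominance interference compared across colour spaces.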
To open wider access to the precious content of historical Balinese palm leaf manuscripts, an appropriate system to transliterate the Balinese script into the Roman script is needed. To achieve this goal, a Balinese glyph recognition scheme is very important. This scheme needs to be developed by taking into account the degraded condition of palm leaf manuscripts and the complexity of the Balinese script. In this paper, we present a complete scheme of spatially categorized glyph recognition for the transliteration of Balinese palm leaf manuscripts. For this scheme, five different categories of glyph recognizers, based on the spatial positions on the manuscript, are proposed. These recognizers are used to verify and validate the recognition result of the global glyph recognizer. Each glyph recognizer is built on a combination of feature extraction methods and is trained as a single-layer neural network, initialized by unsupervised feature learning. The output of the glyph recognition scheme is sent as input to the phonological transliteration system. The results are evaluated against the ground truth of transliterated text provided by philologists. Our scheme shows a very promising result for the transliteration of Balinese palm leaf manuscripts and can be adapted to other types of scripts.
