Al-Jefri et al., 2013 - Google Patents

Context-sensitive Arabic spell checker using context words and n-gram language models

Document ID
5052777831272842068
Author
Al-Jefri M
Mahmoud S
Publication year
2013
Publication venue
2013 Taibah University International Conference on Advances in Information Technology for the Holy Quran and Its Sciences

Snippet

This paper addresses real-word spell checking using context words and n-gram language models. A corpus that consists of different Arabic topics is collected. A collection of confusion sets is normally used in addressing real-word errors. Twenty-eight confusion sets are …
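
The idea of choosing among the members of a confusion set with an n-gram language model can be illustrated with a minimal sketch. It is not the paper's implementation: the toy English corpus, the single confusion set, the add-one smoothing, and the function names (`train_bigrams`, `bigram_prob`, `best_candidate`, `correct`) are all assumptions made for illustration, and the context-words method mentioned in the snippet is not covered.

```python
from collections import Counter

# Toy English corpus and confusion set; both are illustrative assumptions,
# not the Arabic corpus or the confusion sets used in the paper.
CORPUS = [
    "i would like a piece of cake",
    "she ate a piece of bread",
    "they signed a peace treaty",
    "we hope for peace in the region",
]
CONFUSION_SET = {"piece", "peace"}


def train_bigrams(sentences):
    """Count unigrams and bigrams, padding each sentence with <s> and </s>."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def bigram_prob(prev, word, unigrams, bigrams):
    """Add-one (Laplace) smoothed estimate of P(word | prev)."""
    vocab_size = len(unigrams)
    return (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size)


def best_candidate(tokens, index, unigrams, bigrams):
    """Pick the confusion-set member that best fits its left and right neighbours."""
    prev_tok = tokens[index - 1] if index > 0 else "<s>"
    next_tok = tokens[index + 1] if index + 1 < len(tokens) else "</s>"

    def score(candidate):
        # Product of the two bigram probabilities that involve the candidate.
        return (bigram_prob(prev_tok, candidate, unigrams, bigrams)
                * bigram_prob(candidate, next_tok, unigrams, bigrams))

    # Sorting makes tie-breaking deterministic.
    return max(sorted(CONFUSION_SET), key=score)


def correct(sentence, unigrams, bigrams):
    """Replace every confusion-set word with its highest-scoring alternative."""
    tokens = sentence.split()
    for i, tok in enumerate(tokens):
        if tok in CONFUSION_SET:
            tokens[i] = best_candidate(tokens, i, unigrams, bigrams)
    return " ".join(tokens)


if __name__ == "__main__":
    unigrams, bigrams = train_bigrams(CORPUS)
    print(correct("she ate a peace of bread", unigrams, bigrams))
```

Every word in the input is a valid dictionary word, so a non-word spell checker would accept it; only the surrounding bigrams indicate that another member of the confusion set fits better.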

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/3061 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F17/30634 Querying
    • G06F17/30657 Query processing
    • G06F17/30675 Query execution
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/27 Automatic analysis, e.g. parsing
    • G06F17/2705 Parsing
    • G06F17/2715 Statistical methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/27 Automatic analysis, e.g. parsing
    • G06F17/2765 Recognition
    • G06F17/277 Lexical analysis, e.g. tokenisation, collocates
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/28 Processing or translating of natural language
    • G06F17/2809 Data driven translation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/27 Automatic analysis, e.g. parsing
    • G06F17/273 Orthographic correction, e.g. spelling checkers, vowelisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/62 Methods or arrangements for recognition using electronic means
    • G06K9/68 Methods or arrangements for recognition using electronic means using sequential comparisons of the image signals with a plurality of references in which the sequence of the image signals or the references is relevant, e.g. addressable memory
    • G06K9/6807 Dividing the references in groups prior to recognition, the recognition taking place in steps; Selecting relevant dictionaries
    • G06K9/6842 Dividing the references in groups prior to recognition, the recognition taking place in steps; Selecting relevant dictionaries according to the linguistic properties, e.g. English, German
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/21 Text processing
    • G06F17/22 Manipulating or registering by use of codes, e.g. in sequence of text characters
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/62 Methods or arrangements for recognition using electronic means
    • G06K9/6217 Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/00852 Recognising whole cursive words
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/18 Speech classification or search using natural language modelling

Similar Documents

Publication and title
Jauhiainen et al. Automatic language identification in texts: A survey
US9037967B1 (en) Arabic spell checking technique
Azmi et al. A survey of automatic Arabic diacritization techniques
Kukich Techniques for automatically correcting words in text
US11023680B2 (en) Method and system for detecting semantic errors in a text using artificial neural networks
Antony et al. Parts of speech tagging for Indian languages: a literature survey
Azmi et al. Real-word errors in Arabic texts: A better algorithm for detection and correction
Alkanhal et al. Automatic stochastic Arabic spelling correction with emphasis on space insertions and deletions
Dutta et al. Text normalization in code-mixed social media text
Aung et al. A word sense disambiguation system using Naïve Bayesian algorithm for Myanmar language
Mishra et al. A survey of spelling error detection and correction techniques
Al-Jefri et al. Context-sensitive Arabic spell checker using context words and n-gram language models
Kübler et al. Part of speech tagging for Arabic
Chen et al. Integrating natural language processing with image document analysis: what we learned from two real-world applications
Singh et al. Review of real-word error detection and correction methods in text documents
Jain et al. Detection and correction of non word spelling errors in Hindi language
Huang et al. Exploring representation-learning approaches to domain adaptation
Kaur et al. Hybrid approach for spell checker and grammar checker for Punjabi
Alsayadi et al. Integrating semantic features for enhancing Arabic named entity recognition
Kapočiūtė-Dzikienė et al. Character-based machine learning vs. language modeling for diacritics restoration
Tukur et al. Parts-of-speech tagging of Hausa-based texts using hidden Markov model
Mittra et al. A Bangla spell checking technique to facilitate error correction in text entry environment
Randhawa et al. Study of spell checking techniques and available spell checkers in regional languages: a survey
Mahafdah et al. Arabic Part of speech Tagging using k-Nearest Neighbour and Naive Bayes Classifiers Combination.
Reddy et al. Named entity recognition on different languages: A survey