Context-sensitive Arabic spell checker using context words and n-gram language models
Al-Jefri et al., 2013
- Document ID
- 5052777831272842068
- Author
- Al-Jefri M
- Mahmoud S
- Publication year
- 2013
- Publication venue
- 2013 Taibah University International Conference on Advances in Information Technology for the Holy Quran and Its Sciences
Snippet
This paper addresses real-word spell checking using context words and n-gram language models. A corpus that consists of different Arabic topics is collected. A collection of confusion sets is normally used in addressing real-word errors. Twenty eight confusion sets are …
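The snippet's idea of resolving real-word errors with confusion sets and an n-gram language model can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the toy English corpus (a stand-in for the Arabic corpus), the single confusion set, the bigram order, add-one smoothing, and the function names `train_bigrams`, `bigram_logprob`, and `correct`. It covers only the n-gram side of the approach, not the context-words method.

```python
import math
from collections import Counter

# Hypothetical toy corpus and confusion set, for illustration only.
CORPUS = [
    "the weather is nice today",
    "i do not know whether he will come",
    "the weather was cold",
    "ask whether the shop is open",
]
CONFUSION_SETS = [{"weather", "whether"}]


def train_bigrams(sentences):
    """Count unigrams and bigrams, padding each sentence with boundary markers."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def bigram_logprob(prev, word, unigrams, bigrams):
    """Add-one smoothed log P(word | prev)."""
    vocab_size = len(unigrams)
    return math.log((bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size))


def correct(sentence, unigrams, bigrams, confusion_sets=CONFUSION_SETS):
    """Replace each confusable word with the set member that best fits its neighbours."""
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    for i in range(1, len(tokens) - 1):
        for cset in confusion_sets:
            if tokens[i] not in cset:
                continue

            def score(candidate):
                # Score the candidate against its left and right neighbours.
                return (bigram_logprob(tokens[i - 1], candidate, unigrams, bigrams)
                        + bigram_logprob(candidate, tokens[i + 1], unigrams, bigrams))

            # The original word comes first, so it is kept when scores tie.
            candidates = [tokens[i]] + sorted(cset - {tokens[i]})
            tokens[i] = max(candidates, key=score)
    return " ".join(tokens[1:-1])


unigrams, bigrams = train_bigrams(CORPUS)
print(correct("the whether is nice", unigrams, bigrams))
```

Listing the original word first among the candidates is a deliberate choice in this sketch: a word is only changed when another member of its confusion set scores strictly higher in context.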
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F17/30634—Querying
- G06F17/30657—Query processing
- G06F17/30675—Query execution
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2705—Parsing
- G06F17/2715—Statistical methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2765—Recognition
- G06F17/277—Lexical analysis, e.g. tokenisation, collocates
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/28—Processing or translating of natural language
- G06F17/2809—Data driven translation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/273—Orthographic correction, e.g. spelling checkers, vowelisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/68—Methods or arrangements for recognition using electronic means using sequential comparisons of the image signals with a plurality of references in which the sequence of the image signals or the references is relevant, e.g. addressable memory
- G06K9/6807—Dividing the references in groups prior to recognition, the recognition taking place in steps; Selecting relevant dictionaries
- G06K9/6842—Dividing the references in groups prior to recognition, the recognition taking place in steps; Selecting relevant dictionaries according to the linguistic properties, e.g. English, German
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/21—Text processing
- G06F17/22—Manipulating or registering by use of codes, e.g. in sequence of text characters
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6217—Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00852—Recognising whole cursive words
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
Similar Documents
Publication | Title | Publication Date |
---|---|---|
Jauhiainen et al. | Automatic language identification in texts: A survey | |
US9037967B1 (en) | Arabic spell checking technique | |
Azmi et al. | A survey of automatic Arabic diacritization techniques | |
Kukich | Techniques for automatically correcting words in text | |
US11023680B2 (en) | Method and system for detecting semantic errors in a text using artificial neural networks | |
Antony et al. | Parts of speech tagging for Indian languages: a literature survey | |
Azmi et al. | Real-word errors in Arabic texts: A better algorithm for detection and correction | |
Alkanhal et al. | Automatic stochastic arabic spelling correction with emphasis on space insertions and deletions | |
Dutta et al. | Text normalization in code-mixed social media text | |
Aung et al. | A word sense disambiguation system using Naïve Bayesian algorithm for Myanmar language | |
Mishra et al. | A survey of spelling error detection and correction techniques | |
Al-Jefri et al. | Context-sensitive Arabic spell checker using context words and n-gram language models | |
Kübler et al. | Part of speech tagging for Arabic | |
Chen et al. | Integrating natural language processing with image document analysis: what we learned from two real-world applications | |
Singh et al. | Review of real-word error detection and correction methods in text documents | |
Jain et al. | Detection and correction of non word spelling errors in Hindi language | |
Huang et al. | Exploring representation-learning approaches to domain adaptation | |
Kaur et al. | Hybrid approach for spell checker and grammar checker for Punjabi | |
Alsayadi et al. | Integrating semantic features for enhancing arabic named entity recognition | |
Kapočiūtė-Dzikienė et al. | Character-based machine learning vs. language modeling for diacritics restoration | |
Tukur et al. | Parts-of-speech tagging of Hausa-based texts using hidden Markov model | |
Mittra et al. | A bangla spell checking technique to facilitate error correction in text entry environment | |
Randhawa et al. | Study of spell checking techniques and available spell checkers in regional languages: a survey | |
Mahafdah et al. | Arabic Part of speech Tagging using k-Nearest Neighbour and Naive Bayes Classifiers Combination. | |
Reddy et al. | Named entity recognition on different languages: A survey |