Jain et al., 2018 - Google Patents

“UTTAM” an efficient spelling correction system for Hindi language based on supervised learning

Document ID
13486055547073956713
Author
Jain A
Jain M
Jain G
Tayal D
Publication year
2018
Publication venue
ACM Transactions on Asian and Low-Resource Language Information Processing (TALLIP)

Snippet

In this article, we propose a system called “UTTAM,” for correcting spelling errors in Hindi language text using supervised learning. Unlike other languages, Hindi contains a large set of characters, words with inflections and complex characters, phonetically similar sets of …

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/27 Automatic analysis, e.g. parsing
    • G06F17/2765 Recognition
    • G06F17/277 Lexical analysis, e.g. tokenisation, collocates
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/21 Text processing
    • G06F17/22 Manipulating or registering by use of codes, e.g. in sequence of text characters
    • G06F17/2217 Character encodings
    • G06F17/2223 Handling non-latin characters, e.g. kana-to-kanji conversion
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/27 Automatic analysis, e.g. parsing
    • G06F17/2705 Parsing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/3061 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F17/30634 Querying
    • G06F17/30657 Query processing
    • G06F17/30675 Query execution
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/27 Automatic analysis, e.g. parsing
    • G06F17/274 Grammatical analysis; Style critique
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/28 Processing or translating of natural language
    • G06F17/2809 Data driven translation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/28 Processing or translating of natural language
    • G06F17/2863 Processing of non-latin text
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/27 Automatic analysis, e.g. parsing
    • G06F17/2795 Thesaurus; Synonyms
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/21 Text processing
    • G06F17/24 Editing, e.g. insert/delete
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/28 Processing or translating of natural language
    • G06F17/2872 Rule based translation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/3061 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F17/30613 Indexing
    • G06F17/30619 Indexing indexing structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/62 Methods or arrangements for recognition using electronic means
    • G06K9/68 Methods or arrangements for recognition using electronic means using sequential comparisons of the image signals with a plurality of references in which the sequence of the image signals or the references is relevant, e.g. addressable memory
    • G06K9/6807 Dividing the references in groups prior to recognition, the recognition taking place in steps; Selecting relevant dictionaries
    • G06K9/6842 Dividing the references in groups prior to recognition, the recognition taking place in steps; Selecting relevant dictionaries according to the linguistic properties, e.g. English, German
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/00852 Recognising whole cursive words

Similar Documents

Publication Publication Date Title
Benajiba et al. Arabic named entity recognition: A feature-driven study
Toledo et al. Information extraction from historical handwritten document images with a context-aware neural model
CN108124477B (en) Improving word segmenters to process natural language based on pseudo data
Antony et al. Parts of speech tagging for Indian languages: a literature survey
Jain et al. “UTTAM” an efficient spelling correction system for Hindi language based on supervised learning
CN110162771B (en) Event trigger word recognition method and device and electronic equipment
Azmi et al. Real-word errors in Arabic texts: A better algorithm for detection and correction
CN102982021A (en) Method for disambiguating multiple readings in language conversion
Masmoudi et al. Transliteration of Arabizi into Arabic script for Tunisian dialect
Boros et al. Assessing the impact of OCR noise on multilingual event detection over digitised documents
Etaiwi et al. Statistical Arabic name entity recognition approaches: A survey
Mosavi Miangah FarsiSpell: A spell-checking system for Persian using a large monolingual corpus
Habib et al. An exploratory approach to find a novel metric based optimum language model for automatic Bangla word prediction
Sabty et al. Language identification of intra-word code-switching for Arabic–English
Liebeskind et al. Deep learning for period classification of historical Hebrew texts
Onyenwe et al. Toward an effective Igbo part-of-speech tagger
Mekki et al. Tokenization of Tunisian Arabic: A comparison between three machine learning models
Banerjee et al. Named entity recognition on code-mixed cross-script social media content
Mazitov et al. Named entity recognition in Russian using multi-task LSTM-CRF
Khorjuvenkar et al. Parts of speech tagging for Konkani language
Shirko Part of speech tagging for Wolaita language using transformation based learning (TBL) approach
Celikyilmaz et al. An empirical investigation of word class-based features for natural language understanding
Algahtani Arabic named entity recognition: A corpus-based study
Muhamad et al. Proposal: A hybrid dictionary modelling approach for Malay tweet normalization
Momand et al. A comparative study of dictionary-based and machine learning-based named entity recognition in Pashto