Jain et al., 2018 - Google Patents
“UTTAM” an efficient spelling correction system for hindi language based on supervised learningJain et al., 2018
- Document ID
- 13486055547073956713
- Author
- Jain A
- Jain M
- Jain G
- Tayal D
- Publication year
- Publication venue
- ACM Transactions on Asian and Low-Resource Language Information Processing (TALLIP)
External Links
Snippet
In this article, we propose a system called “UTTAM,” for correcting spelling errors in Hindi language text using supervised learning. Unlike other languages, Hindi contains a large set of characters, words with inflections and complex characters, phonetically similar sets of …
- 238000004422 calculation algorithm 0 abstract description 52
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2765—Recognition
- G06F17/277—Lexical analysis, e.g. tokenisation, collocates
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/21—Text processing
- G06F17/22—Manipulating or registering by use of codes, e.g. in sequence of text characters
- G06F17/2217—Character encodings
- G06F17/2223—Handling non-latin characters, e.g. kana-to-kanji conversion
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2705—Parsing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F17/30634—Querying
- G06F17/30657—Query processing
- G06F17/30675—Query execution
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/274—Grammatical analysis; Style critique
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/28—Processing or translating of natural language
- G06F17/2809—Data driven translation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/28—Processing or translating of natural language
- G06F17/2863—Processing of non-latin text
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2795—Thesaurus; Synonyms
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/21—Text processing
- G06F17/24—Editing, e.g. insert/delete
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/28—Processing or translating of natural language
- G06F17/2872—Rule based translation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F17/30613—Indexing
- G06F17/30619—Indexing indexing structures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/68—Methods or arrangements for recognition using electronic means using sequential comparisons of the image signals with a plurality of references in which the sequence of the image signals or the references is relevant, e.g. addressable memory
- G06K9/6807—Dividing the references in groups prior to recognition, the recognition taking place in steps; Selecting relevant dictionaries
- G06K9/6842—Dividing the references in groups prior to recognition, the recognition taking place in steps; Selecting relevant dictionaries according to the linguistic properties, e.g. English, German
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00852—Recognising whole cursive words
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Benajiba et al. | Arabic named entity recognition: A feature-driven study | |
Toledo et al. | Information extraction from historical handwritten document images with a context-aware neural model | |
CN108124477B (en) | Improving word segmenters to process natural language based on pseudo data | |
Antony et al. | Parts of speech tagging for Indian languages: a literature survey | |
Jain et al. | “UTTAM” an efficient spelling correction system for hindi language based on supervised learning | |
CN110162771B (en) | Event trigger word recognition method and device and electronic equipment | |
Azmi et al. | Real-word errors in Arabic texts: A better algorithm for detection and correction | |
CN102982021A (en) | Method for disambiguating multiple readings in language conversion | |
Masmoudi et al. | Transliteration of Arabizi into Arabic script for Tunisian dialect | |
Boros et al. | Assessing the impact of OCR noise on multilingual event detection over digitised documents | |
Etaiwi et al. | Statistical Arabic name entity recognition approaches: A survey | |
Mosavi Miangah | FarsiSpell: A spell-checking system for Persian using a large monolingual corpus | |
Habib et al. | An exploratory approach to find a novel metric based optimum language model for automatic Bangla word prediction | |
Sabty et al. | Language identification of intra-word code-switching for arabic–english | |
Liebeskind et al. | Deep learning for period classification of historical Hebrew texts | |
Onyenwe et al. | Toward an effective igbo part-of-speech tagger | |
Mekki et al. | Tokenization of Tunisian Arabic: A comparison between three machine learning models | |
Banerjee et al. | Named entity recognition on code-mixed cross-script social media content | |
Mazitov et al. | Named entity recognition in Russian using multi-task LSTM-CRF | |
Khorjuvenkar et al. | Parts of speech tagging for Konkani language | |
Shirko | Part of speech tagging for wolaita language using transformation based learning (tbl) approach | |
Celikyilmaz et al. | An empirical investigation of word class-based features for natural language understanding | |
Algahtani | Arabic named entity recognition: A corpus-based study | |
Muhamad et al. | Proposal: A hybrid dictionary modelling approach for Malay tweet normalization | |
Momand et al. | A comparative study of dictionary-based and machine learning-based named entity recognition in Pashto |