Developing lexicographic sorting: An example for Urdu
Hussain et al., 2007
- Document ID
- 6501414784248452755
- Author
- Hussain S
- Gul S
- Waseem A
- Publication year
- 2007
- Publication venue
- ACM Transactions on Asian Language Information Processing (TALIP)
Snippet
Collation or lexicographic sorting is essential to develop multilingual computing. This paper presents the challenges faced in developing collation sequence for a language. The paper discusses both theoretical linguistic and practical standardization and encoding related …
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/28—Processing or translating of natural language
- G06F17/2809—Data driven translation
- G06F17/2827—Example based machine translation; Alignment
- G06F17/289—Use of machine translation, e.g. multi-lingual retrieval, server side translation for client devices, real-time translation
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Palmer | | Tokenisation and sentence segmentation |
Mair | | What is a Chinese "dialect/topolect"?: Reflections on some key Sino-English linguistic terms |
KR100259407B1 (en) | | Keyboard for a system and method for processing chinese language text |
KR100962956B1 (en) | | Optimal Input Method of Binary Operation Code for World Character Information and Its Information Processing System |
JP2016186805A (en) | | Modular system and method for managing language data in chinese, japanese and korean in electronic mode |
CN1558341A (en) | | Chinese character / pin yin / english translator |
Kirov et al. | | Context-aware transliteration of romanized south asian languages |
Lu | | Computers and Chinese writing systems |
Gasser | | Semitic morphological analysis and generation using finite state transducers with feature structures |
Ohm et al. | | Study of tokenization strategies for the Santhali language |
Hussain et al. | | Developing lexicographic sorting: An example for Urdu |
Halpern | | The challenges and pitfalls of Arabic romanization and arabization |
Hussain et al. | | Survey of language computing in Asia |
CN103246354A (en) | | Inputting method for encoding and expressing Chinese characters through common language characters and keyboards of inputting method |
Ziegler | | The automatic identification of languages using linguistic recognition signals |
CN1018205B (en) | | Chinese voice-digit coding input technique for computer |
Adams | | Internationalization and character set standards |
Gutkin et al. | | Extensions to Brahmic script processing within the Nisaba library: new scripts, languages and utilities |
Hussain et al. | | PAN localization: A study on collation of languages from developing Asia |
Hussain et al. | | Urdu encoding and collation sequence for localization |
Baiju et al. | | Romanized to native Malayalam script transliteration using an encoder-decoder framework |
Jordan | | Languages left behind: Keeping Taiwanese off the World Wide Web |
Joshi et al. | | Input Scheme for Hindi Using Phonetic Mapping |
Wu et al. | | On the robustness of cognate generation models |
Dias et al. | | Development of standards for Sinhala computing |