Hussain et al., 2007 - Google Patents

Developing lexicographic sorting: An example for Urdu

Document ID
6501414784248452755
Author
Hussain S
Gul S
Waseem A
Publication year
2007
Publication venue
ACM Transactions on Asian Language Information Processing (TALIP)

Snippet

Collation, or lexicographic sorting, is essential to developing multilingual computing. This paper presents the challenges faced in developing a collation sequence for a language. The paper discusses both theoretical linguistic and practical standardization and encoding-related …

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/28 Processing or translating of natural language
    • G06F17/2809 Data driven translation
    • G06F17/2827 Example based machine translation; Alignment
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/28 Processing or translating of natural language
    • G06F17/289 Use of machine translation, e.g. multi-lingual retrieval, server side translation for client devices, real-time translation

Similar Documents

Publication Publication Date Title
Palmer Tokenisation and sentence segmentation
Mair What is a Chinese "dialect/topolect"?: Reflections on some key Sino-English linguistic terms
KR100259407B1 (en) Keyboard for a system and method for processing chinese language text
KR100962956B1 (en) Optimal Input Method of Binary Operation Code for World Character Information and Its Information Processing System
JP2016186805A (en) Modular system and method for managing language data in chinese, japanese and korean in electronic mode
CN1558341A (en) Chinese character / pin yin / english translator
Kirov et al. Context-aware transliteration of romanized south asian languages
Lu Computers and Chinese writing systems
Gasser Semitic morphological analysis and generation using finite state transducers with feature structures
Ohm et al. Study of tokenization strategies for the santhali language
Hussain et al. Developing lexicographic sorting: An example for Urdu
Halpern The challenges and pitfalls of Arabic romanization and arabization
Hussain et al. Survey of language computing in Asia
CN103246354A (en) Inputting method for encoding and expressing Chinese characters through common language characters and keyboards of inputting method
Ziegler The automatic identification of languages using linguistic recognition signals
CN1018205B (en) Chinese voice-digit coding input technique for computer
Adams Internationalization and character set standards
Gutkin et al. Extensions to Brahmic script processing within the Nisaba library: new scripts, languages and utilities
Hussain et al. PAN localization: A study on collation of languages from developing Asia
Hussain et al. Urdu encoding and collation sequence for localization
Baiju et al. Romanized to native malayalam script transliteration using an encoder-decoder framework
Jordan Languages left behind: Keeping Taiwanese off the World Wide Web
Joshi et al. Input Scheme for Hindi Using Phonetic Mapping
Wu et al. On the robustness of cognate generation models
Dias et al. Development of standards for Sinhala computing