Hussain et al., 2007 - Google Patents

Developing lexicographic sorting: An example for Urdu

Document ID: 6501414784248452755
Author: Hussain S; Gul S; Waseem A
Publication year: 2007
Publication venue: ACM Transactions on Asian Language Information Processing (TALIP)

Snippet

Collation, or lexicographic sorting, is essential to developing multilingual computing. This paper presents the challenges faced in developing a collation sequence for a language. The paper discusses both theoretical linguistic and practical standardization and encoding-related …

Classifications

- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/28—Processing or translating of natural language
- G06F17/2809—Data driven translation
- G06F17/2827—Example based machine translation; Alignment
- G06F17/289—Use of machine translation, e.g. multi-lingual retrieval, server side translation for client devices, real-time translation

Similar Documents

Publication	Publication Date	Title
Palmer	2000	Tokenisation and sentence segmentation
Mair	1991	What is a Chinese "dialect/topolect"?: Reflections on some key Sino-English linguistic terms
KR100259407B1 (en)	2000-06-15	Keyboard for a system and method for processing Chinese language text
KR100962956B1 (en)	2010-06-10	Optimal Input Method of Binary Operation Code for World Character Information and Its Information Processing System
JP2016186805A (en)	2016-10-27	Modular system and method for managing language data in Chinese, Japanese and Korean in electronic mode
CN1558341A (en)	2004-12-29	Chinese character / pin yin / english translator
Kirov et al.	2024	Context-aware transliteration of romanized South Asian languages
Lu	2019	Computers and Chinese writing systems
Gasser	2009	Semitic morphological analysis and generation using finite state transducers with feature structures
Ohm et al.	2024	Study of tokenization strategies for the Santhali language
Hussain et al.	2007	Developing lexicographic sorting: An example for Urdu
Halpern	2007	The challenges and pitfalls of Arabic romanization and arabization
Hussain et al.	2005	Survey of language computing in Asia
CN103246354A (en)	2013-08-14	Inputting method for encoding and expressing Chinese characters through common language characters and keyboards of inputting method
Ziegler	1991	The automatic identification of languages using linguistic recognition signals
CN1018205B (en)	1992-09-09	Chinese voice-digit coding input technique for computer
Adams	1993	Internationalization and character set standards
Gutkin et al.	2022	Extensions to Brahmic script processing within the Nisaba library: new scripts, languages and utilities
Hussain et al.	2008	PAN localization: A study on collation of languages from developing Asia
Hussain et al.	0	Urdu encoding and collation sequence for localization
Baiju et al.	2024	Romanized to native Malayalam script transliteration using an encoder-decoder framework
Jordan	2002	Languages left behind: Keeping Taiwanese off the World Wide Web
Joshi et al.	2012	Input Scheme for Hindi Using Phonetic Mapping
Wu et al.	2022	On the robustness of cognate generation models
Dias et al.	2004	Development of standards for Sinhala computing