Kondrak, 2001 - Google Patents

Identifying cognates by phonetic and semantic similarity

Document ID
4175948356455235394
Author
Kondrak G
Publication year
2001
Publication venue
Second Meeting of the North American Chapter of the Association for Computational Linguistics

Snippet

I present a method of identifying cognates in the vocabularies of related languages. I show that a measure of phonetic similarity based on multivalued features performs better than “orthographic” measures, such as the Longest Common Subsequence Ratio (LCSR) or …
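
The snippet names the Longest Common Subsequence Ratio (LCSR) as one of the "orthographic" baselines. As a rough illustration only, the Python sketch below assumes the commonly used definition of LCSR: the length of the longest common subsequence of two words divided by the length of the longer word. The function names `lcs_length` and `lcsr`, and the convention of returning 0.0 for two empty strings, are assumptions of this sketch and are not taken from the paper.

```python
def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b (row-by-row DP)."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                # Extend the best subsequence that ends before both characters.
                curr.append(prev[j - 1] + 1)
            else:
                # Skip one character from either word, whichever is better.
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcsr(a: str, b: str) -> float:
    """LCS length divided by the length of the longer word (assumed definition)."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 0.0  # assumed convention for two empty strings
    return lcs_length(a, b) / longest


if __name__ == "__main__":
    # "colour" and "couleur" share the subsequence c-o-l-u-r (length 5).
    print(lcsr("colour", "couleur"))
```

A measure like this compares spellings character by character, so it cannot tell that two different letters stand for similar sounds; that is the gap the snippet's phonetic measure based on multivalued features is said to address.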

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/27 Automatic analysis, e.g. parsing
    • G06F17/2705 Parsing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/27 Automatic analysis, e.g. parsing
    • G06F17/2765 Recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/27 Automatic analysis, e.g. parsing
    • G06F17/274 Grammatical analysis; Style critique
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/3061 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F17/30634 Querying
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/28 Processing or translating of natural language
    • G06F17/2809 Data driven translation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/28 Processing or translating of natural language
    • G06F17/2863 Processing of non-latin text
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/50 Computer-aided design
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/62 Methods or arrangements for recognition using electronic means
    • G06K9/68 Methods or arrangements for recognition using electronic means using sequential comparisons of the image signals with a plurality of references in which the sequence of the image signals or the references is relevant, e.g. addressable memory
    • G06K9/6807 Dividing the references in groups prior to recognition, the recognition taking place in steps; Selecting relevant dictionaries
    • G06K9/6842 Dividing the references in groups prior to recognition, the recognition taking place in steps; Selecting relevant dictionaries according to the linguistic properties, e.g. English, German
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/36 Image preprocessing, i.e. processing the image information without deciding about the identity of the image
    • G06K9/46 Extraction of features or characteristics of the image

Similar Documents

Publication Publication Date Title
Kondrak Identifying cognates by phonetic and semantic similarity
CN106598937B (en) Language Identification, device and electronic equipment for text
Sporleder et al. Unsupervised recognition of literal and non-literal use of idiomatic expressions
Mackay et al. Computing word similarity and identifying cognates with Pair Hidden Markov Models
WO1997004405A9 (en) Method and apparatus for automated search and retrieval processing
JP2008262587A (en) Example based machine translation system
Darwish et al. Using Stem-Templates to Improve Arabic POS and Gender/Number Tagging.
Adouane et al. Identification of languages in Algerian Arabic multilingual documents
Tedeschi et al. ID10M: Idiom identification in 10 languages
Bedrick et al. Robust kaomoji detection in Twitter
Charoenpornsawat et al. Automatic sentence break disambiguation for Thai
Frunza et al. Identification and disambiguation of cognates, false friends, and partial cognates using machine learning techniques
Vikram et al. Development of prototype morphological analyzer for he south indian language of kannada
Loftsson et al. Developing a PoS-tagged corpus using existing tools
Kondrak Combining evidence in cognate identification
US20110106849A1 (en) New case generation device, new case generation method, and new case generation program
Parida et al. Translating short segments with nmt: A case study in english-to-hindi
Sharma et al. Improving existing punjabi grammar checker
Graham Using natural language processing to search for textual references
Gardner et al. Automatic link detection: a sequence labeling approach
Kulick et al. Parsing Early Modern English for Linguistic Search
Frunza Automatic identification of cognates, false friends, and partial cognates
Garcia et al. A Method to Automatically Identify Diachronic Variation in Collocations.
Tyrkkö et al. Semi-automatic discovery of multilingual elements in English historical corpora: Methods and challenges
Hurskainen Optimizing disambiguation in Swahili