Héja, 2010 - Google Patents

The Role of Parallel Corpora in Bilingual Lexicography.

Document ID
10117826592182191070
Author
Héja E
Publication year
2010
Publication venue
LREC

Snippet

This paper describes an approach based on word alignment on parallel corpora, which aims at facilitating the lexicographic work of dictionary building. Although this method has been widely used in the MT community for at least 16 years, as far as we know, it has not been …

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/28 Processing or translating of natural language
    • G06F17/2809 Data driven translation
    • G06F17/2827 Example based machine translation; Alignment
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/28 Processing or translating of natural language
    • G06F17/289 Use of machine translation, e.g. multi-lingual retrieval, server side translation for client devices, real-time translation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/27 Automatic analysis, e.g. parsing
    • G06F17/2765 Recognition
    • G06F17/277 Lexical analysis, e.g. tokenisation, collocates
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/28 Processing or translating of natural language
    • G06F17/2872 Rule based translation
    • G06F17/2881 Natural language generation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/27 Automatic analysis, e.g. parsing
    • G06F17/2705 Parsing
    • G06F17/271 Syntactic parsing, e.g. based on context-free grammar [CFG], unification grammars
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/3061 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F17/30634 Querying
    • G06F17/30657 Query processing
    • G06F17/3066 Query translation
    • G06F17/30669 Translation of the query language, e.g. Chinese to English
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/27 Automatic analysis, e.g. parsing
    • G06F17/2785 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/27 Automatic analysis, e.g. parsing
    • G06F17/274 Grammatical analysis; Style critique
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/21 Text processing
    • G06F17/22 Manipulating or registering by use of codes, e.g. in sequence of text characters
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30286 Information retrieval; Database structures therefor; File system structures therefor in structured data stores

Similar Documents

Publication and Title
Banea et al. A bootstrapping method for building subjectivity lexicons for languages with scarce resources.
Vyas et al. POS tagging of English-Hindi code-mixed social media content
JP3906356B2 (en) Syntax analysis method and apparatus
Othman et al. English-ASL gloss parallel corpus 2012: ASLG-PC12
Volk et al. Machine translation of TV subtitles for large scale production
Mititelu et al. CoRoLa - The Reference Corpus of Contemporary Romanian Language.
Gupta et al. Problems with automating translation of movie/TV show subtitles
Dayter Collocations in non-interpreted and simultaneously interpreted English: a corpus study
Popović On reducing translation shifts in translations intended for MT evaluation
Héja The Role of Parallel Corpora in Bilingual Lexicography.
Popović Evaluating conjunction disambiguation on English-to-German and French-to-German WMT 2019 translation hypotheses
Sanjaya et al. Analysis of Category Shift on Emma Heesters’s Cover Song Lyrics on YouTube
Li et al. Uzbek-English and Turkish-English morpheme alignment corpora
Jian et al. TANGO: Bilingual collocational concordancer
Marujo et al. BP2EP - adaptation of Brazilian Portuguese texts to European Portuguese
Skadiņa et al. Latvian Language in the Digital Age: The Main Achievements in the Last Decade.
Volk The automatic translation of film subtitles. A machine translation success story?
Hamed et al. A survey of code-switched Arabic NLP: Progress, challenges, and future directions
Héja et al. Dictionary building based on parallel corpora and word alignment
Arkhangelskiy et al. Sound-aligned corpus of Udmurt dialectal texts
Srdanović Corpus-based collocation research targeted at Japanese language learners
Weller-Di Marco et al. Modeling complement types in phrase-based SMT
Burchardt et al. Machine translation quality in an audiovisual context
Song et al. Entity Translation and Alignment in the ACE-07 ET Task.
Héja et al. An online dictionary browser for automatically generated bilingual dictionaries