[go: up one dir, main page]

Garcia-Martinez et al., 2020 - Google Patents

Addressing data sparsity for neural machine translation between morphologically rich languages

Garcia-Martinez et al., 2020

View PDF
Document ID
30746308892404921
Author
Garcia-Martinez M
Aransa W
Bougares F
Barrault L
Publication year
Publication venue
Machine Translation

External Links

Snippet

Translating between morphologically rich languages is still challenging for current machine translation systems. In this paper, we experiment with various neural machine translation (NMT) architectures to address the data sparsity problem caused by data availability …
Continue reading at eprints.whiterose.ac.uk (PDF) (other versions)

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20Handling natural language data
    • G06F17/28Processing or translating of natural language
    • G06F17/2809Data driven translation
    • G06F17/2827Example based machine translation; Alignment
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/3061Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F17/30634Querying
    • G06F17/30657Query processing
    • G06F17/3066Query translation
    • G06F17/30669Translation of the query language, e.g. Chinese to English
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20Handling natural language data
    • G06F17/27Automatic analysis, e.g. parsing
    • G06F17/2765Recognition
    • G06F17/277Lexical analysis, e.g. tokenisation, collocates
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20Handling natural language data
    • G06F17/27Automatic analysis, e.g. parsing
    • G06F17/2705Parsing
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/3061Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F17/30634Querying
    • G06F17/30657Query processing
    • G06F17/30675Query execution
    • G06F17/30684Query execution using natural language analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20Handling natural language data
    • G06F17/28Processing or translating of natural language
    • G06F17/2872Rule based translation
    • G06F17/2881Natural language generation
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20Handling natural language data
    • G06F17/28Processing or translating of natural language
    • G06F17/289Use of machine translation, e.g. multi-lingual retrieval, server side translation for client devices, real-time translation
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20Handling natural language data
    • G06F17/21Text processing
    • G06F17/22Manipulating or registering by use of codes, e.g. in sequence of text characters
    • G06F17/2217Character encodings
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20Handling natural language data
    • G06F17/27Automatic analysis, e.g. parsing
    • G06F17/2795Thesaurus; Synonyms
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20Handling natural language data
    • G06F17/27Automatic analysis, e.g. parsing
    • G06F17/2785Semantic analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20Handling natural language data
    • G06F17/28Processing or translating of natural language
    • G06F17/2863Processing of non-latin text
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/08Speech classification or search
    • G10L15/18Speech classification or search using natural language modelling

Similar Documents

Publication Publication Date Title
Malmi et al. Encode, tag, realize: High-precision text editing
Farahani et al. Parsbert: Transformer-based model for persian language understanding
Liao et al. Improving readability for automatic speech recognition transcription
Baniata et al. A neural machine translation model for Arabic dialects that utilises multitask learning (MTL)
CN110914827A (en) Multi-language semantic parser based on transfer learning
Zhao et al. Automatic interlinear glossing for under-resourced languages leveraging translations
Garcia-Martinez et al. Addressing data sparsity for neural machine translation between morphologically rich languages
Dewangan et al. Experience of neural machine translation between Indian languages
Shi et al. Low-resource neural machine translation: Methods and trends
García-Martínez et al. Neural machine translation by generating multiple linguistic factors
De la Rosa et al. Transformers analyzing poetry: multilingual metrical pattern prediction with transfomer-based language models
Singh et al. Improving neural machine translation for low-resource Indian languages using rule-based feature extraction
Mahata et al. Simplification of English and Bengali sentences for improving quality of machine translation
Ehsan et al. Improving phrase chunking by using contextualized word embeddings for a morphologically rich language
Lyons A review of Thai–English machine translation
MohammadiBaghmolaei et al. TET: Text emotion transfer
Kaya et al. Effect of tokenization granularity for Turkish large language models
Oflazer et al. Turkish and its challenges for language and speech processing
Amri et al. Amazigh POS tagging using TreeTagger: a language independant model
Singh et al. English-Manipuri machine translation: an empirical study of different supervised and unsupervised methods
Rentschler et al. Data augmentation for intent classification of German conversational agents in the finance domain
Madasamy et al. Transfer learning based code-mixed part-of-speech tagging using character level representations for Indian languages
Turki Khemakhem et al. POS tagging without a tagger: using aligned corpora for transferring knowledge to under-resourced languages
Kaur et al. Roman to gurmukhi social media text normalization
Švec et al. Automatic correction of i/y spelling in Czech ASR output