Addressing data sparsity for neural machine translation between morphologically rich languages
Garcia-Martinez et al., 2020
- Document ID
- 30746308892404921
- Author
- Garcia-Martinez M
- Aransa W
- Bougares F
- Barrault L
- Publication year
- 2020
- Publication venue
- Machine Translation
Snippet
Translating between morphologically rich languages is still challenging for current machine translation systems. In this paper, we experiment with various neural machine translation (NMT) architectures to address the data sparsity problem caused by data availability …
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/28—Processing or translating of natural language
- G06F17/2809—Data driven translation
- G06F17/2827—Example based machine translation; Alignment
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F17/30634—Querying
- G06F17/30657—Query processing
- G06F17/3066—Query translation
- G06F17/30669—Translation of the query language, e.g. Chinese to English
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2765—Recognition
- G06F17/277—Lexical analysis, e.g. tokenisation, collocates
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2705—Parsing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F17/30634—Querying
- G06F17/30657—Query processing
- G06F17/30675—Query execution
- G06F17/30684—Query execution using natural language analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/28—Processing or translating of natural language
- G06F17/2872—Rule based translation
- G06F17/2881—Natural language generation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/28—Processing or translating of natural language
- G06F17/289—Use of machine translation, e.g. multi-lingual retrieval, server side translation for client devices, real-time translation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/21—Text processing
- G06F17/22—Manipulating or registering by use of codes, e.g. in sequence of text characters
- G06F17/2217—Character encodings
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2795—Thesaurus; Synonyms
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2785—Semantic analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/28—Processing or translating of natural language
- G06F17/2863—Processing of non-latin text
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Malmi et al. | | Encode, tag, realize: High-precision text editing |
Farahani et al. | | Parsbert: Transformer-based model for persian language understanding |
Liao et al. | | Improving readability for automatic speech recognition transcription |
Baniata et al. | | A neural machine translation model for Arabic dialects that utilises multitask learning (MTL) |
CN110914827A (en) | | Multi-language semantic parser based on transfer learning |
Zhao et al. | | Automatic interlinear glossing for under-resourced languages leveraging translations |
Garcia-Martinez et al. | | Addressing data sparsity for neural machine translation between morphologically rich languages |
Dewangan et al. | | Experience of neural machine translation between Indian languages |
Shi et al. | | Low-resource neural machine translation: Methods and trends |
García-Martínez et al. | | Neural machine translation by generating multiple linguistic factors |
De la Rosa et al. | | Transformers analyzing poetry: multilingual metrical pattern prediction with transfomer-based language models |
Singh et al. | | Improving neural machine translation for low-resource Indian languages using rule-based feature extraction |
Mahata et al. | | Simplification of English and Bengali sentences for improving quality of machine translation |
Ehsan et al. | | Improving phrase chunking by using contextualized word embeddings for a morphologically rich language |
Lyons | | A review of Thai–English machine translation |
MohammadiBaghmolaei et al. | | TET: Text emotion transfer |
Kaya et al. | | Effect of tokenization granularity for Turkish large language models |
Oflazer et al. | | Turkish and its challenges for language and speech processing |
Amri et al. | | Amazigh POS tagging using TreeTagger: a language independant model |
Singh et al. | | English-Manipuri machine translation: an empirical study of different supervised and unsupervised methods |
Rentschler et al. | | Data augmentation for intent classification of German conversational agents in the finance domain |
Madasamy et al. | | Transfer learning based code-mixed part-of-speech tagging using character level representations for Indian languages |
Turki Khemakhem et al. | | POS tagging without a tagger: using aligned corpora for transferring knowledge to under-resourced languages |
Kaur et al. | | Roman to gurmukhi social media text normalization |
Švec et al. | | Automatic correction of i/y spelling in Czech ASR output |