Incorporating dialectal variability for socially equitable language identification
Jurgens et al., 2017
- Document ID
- 9742807769452134601
- Author
- Jurgens D
- Tsvetkov Y
- Jurafsky D
- Publication year
- 2017
- Publication venue
- Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)
Snippet
Abstract: Language identification (LID) is a critical first step for processing multilingual text. Yet most LID systems are not designed to handle the linguistic diversity of global platforms like Twitter, where local dialects and rampant code-switching lead language classifiers to …
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/28—Processing or translating of natural language
- G06F17/2809—Data driven translation
- G06F17/2827—Example based machine translation; Alignment
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/28—Processing or translating of natural language
- G06F17/289—Use of machine translation, e.g. multi-lingual retrieval, server side translation for client devices, real-time translation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2765—Recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/28—Processing or translating of natural language
- G06F17/2872—Rule based translation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2705—Parsing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/21—Text processing
- G06F17/22—Manipulating or registering by use of codes, e.g. in sequence of text characters
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30861—Retrieval from the Internet, e.g. browsers
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06Q—DATA PROCESSING SYSTEMS OR METHODS, SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
Similar Documents
Publication | Title |
---|---|
Jurgens et al. | Incorporating dialectal variability for socially equitable language identification |
Balahur et al. | Computational approaches to subjectivity and sentiment analysis: Present and envisaged methods and applications |
Vyas et al. | POS tagging of English-Hindi code-mixed social media content |
Abdul-Mageed et al. | SANA: A large scale multi-genre, multi-dialect lexicon for Arabic subjectivity and sentiment analysis |
KR101130444B1 (en) | System for identifying paraphrases using machine translation techniques |
Badaro et al. | EMA at SemEval-2018 task 1: Emotion mining for Arabic |
Kausar et al. | ProSOUL: a framework to identify propaganda from online Urdu content |
Chiril et al. | Multilingual and multitarget hate speech detection in tweets |
Wang et al. | Classification for crisis-related tweets leveraging word embeddings and data augmentation |
Parameswarappa et al. | Kannada word sense disambiguation using decision list |
Smith et al. | Does ‘well-being’ translate on Twitter? |
Yusuf et al. | Sentiment analysis in low-resource settings: a comprehensive review of approaches, languages, and data sources |
Onal et al. | Named entity recognition from scratch on social media |
Onyenwe et al. | Toward an effective Igbo part-of-speech tagger |
Sinan Yüksel et al. | A real-time social network-based knowledge discovery system for decision making |
Paul et al. | English to Nepali statistical machine translation system |
pal Singh et al. | Naive Bayes classifier for word sense disambiguation of Punjabi language |
Albogamy et al. | Unsupervised stemmer for Arabic tweets |
Devi et al. | Steps of pre-processing for English to Mizo SMT system |
Deep et al. | Machine translation system using deep learning for Punjabi to English |
Kaur et al. | Punjabi dialects conversion system for Majhi, Malwai and Doabi dialects |
Duong et al. | Measuring similarity for short texts on social media |
Alsudais | Image classification in Arabic: exploring direct English to Arabic translations |
Drymonas et al. | Opinion mapping travelblogs |
Shams et al. | Development of a conceptual structure for a domain-specific corpus |