Jurgens et al., 2017 - Google Patents

Incorporating dialectal variability for socially equitable language identification

Document ID
9742807769452134601
Author
Jurgens D
Tsvetkov Y
Jurafsky D
Publication year
2017
Publication venue
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Snippet

Abstract: Language identification (LID) is a critical first step for processing multilingual text. Yet most LID systems are not designed to handle the linguistic diversity of global platforms like Twitter, where local dialects and rampant code-switching lead language classifiers to …

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/28 Processing or translating of natural language
    • G06F17/2809 Data driven translation
    • G06F17/2827 Example based machine translation; Alignment
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/28 Processing or translating of natural language
    • G06F17/289 Use of machine translation, e.g. multi-lingual retrieval, server side translation for client devices, real-time translation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/27 Automatic analysis, e.g. parsing
    • G06F17/2765 Recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/28 Processing or translating of natural language
    • G06F17/2872 Rule based translation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/27 Automatic analysis, e.g. parsing
    • G06F17/2705 Parsing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/3061 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/21 Text processing
    • G06F17/22 Manipulating or registering by use of codes, e.g. in sequence of text characters
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30861 Retrieval from the Internet, e.g. browsers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06Q DATA PROCESSING SYSTEMS OR METHODS, SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints

Similar Documents

Publication Publication Date Title
Jurgens et al. Incorporating dialectal variability for socially equitable language identification
Balahur et al. Computational approaches to subjectivity and sentiment analysis: Present and envisaged methods and applications
Vyas et al. POS tagging of English-Hindi code-mixed social media content
Abdul-Mageed et al. SANA: A large scale multi-genre, multi-dialect lexicon for Arabic subjectivity and sentiment analysis
KR101130444B1 (en) System for identifying paraphrases using machine translation techniques
Badaro et al. EMA at SemEval-2018 task 1: Emotion mining for Arabic
Kausar et al. ProSOUL: a framework to identify propaganda from online Urdu content
Chiril et al. Multilingual and multitarget hate speech detection in tweets
Wang et al. Classification for crisis-related tweets leveraging word embeddings and data augmentation
Parameswarappa et al. Kannada word sense disambiguation using decision list
Smith et al. Does ‘well-being’ translate on Twitter?
Yusuf et al. Sentiment analysis in low-resource settings: a comprehensive review of approaches, languages, and data sources
Onal et al. Named entity recognition from scratch on social media
Onyenwe et al. Toward an effective Igbo part-of-speech tagger
Sinan Yüksel et al. A real-time social network-based knowledge discovery system for decision making
Paul et al. English to Nepali statistical machine translation system
pal Singh et al. Naive Bayes classifier for word sense disambiguation of Punjabi language
Albogamy et al. Unsupervised stemmer for Arabic tweets
Devi et al. Steps of pre-processing for English to Mizo SMT system
Deep et al. Machine translation system using deep learning for Punjabi to English
Kaur et al. Punjabi dialects conversion system for Majhi, Malwai and Doabi dialects
Duong et al. Measuring similarity for short texts on social media
Alsudais Image classification in Arabic: exploring direct English to Arabic translations
Drymonas et al. Opinion mapping travelblogs
Shams et al. Development of a conceptual structure for a domain-specific corpus