Choosing the right lemma when analysing German nouns
Volk, 1999
- Document ID
- 12812578208147423444
- Author
- Volk M
- Publication year
- 1999
Snippet
1. Introduction
When processing large corpora, it is often necessary to lemmatise the wordforms. This is usually done by a morphological analyser which can, in any case, undo inflection but sometimes even derivation and compounding. The latter is especially useful for …
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2765—Recognition
- G06F17/277—Lexical analysis, e.g. tokenisation, collocates
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2765—Recognition
- G06F17/2775—Phrasal analysis, e.g. finite state techniques, chunking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/274—Grammatical analysis; Style critique
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2795—Thesaurus; Synonyms
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2705—Parsing
- G06F17/2715—Statistical methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F17/30634—Querying
- G06F17/30657—Query processing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/28—Processing or translating of natural language
- G06F17/2863—Processing of non-latin text
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/28—Processing or translating of natural language
- G06F17/2809—Data driven translation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/273—Orthographic correction, e.g. spelling checkers, vowelisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/21—Text processing
- G06F17/22—Manipulating or registering by use of codes, e.g. in sequence of text characters
Similar Documents
Publication | Title | Publication Date |
---|---|---|
Wacholder et al. | Disambiguation of proper names in text | |
Wu et al. | Large-scale automatic extraction of an English-Chinese translation lexicon | |
KR940022316A (en) | Keyword Extractor for Japanese Documents | |
US8170867B2 (en) | System for extracting information from a natural language text | |
Dasgupta et al. | Unsupervised morphological parsing of Bengali | |
Wermter et al. | Collocation extraction based on modifiability statistics | |
Chanod et al. | Creating a tagset, lexicon and guesser for a French tagger | |
Mima et al. | An application and evaluation of the C/NC-value approach for the automatic term recognition of multi-word units in Japanese | |
Uchimoto et al. | The unknown word problem: a morphological analysis of Japanese using maximum entropy aided by a dictionary | |
US7376551B2 (en) | Definition extraction | |
Nagy et al. | Detecting light verb constructions across languages | |
Volk | Choosing the right lemma when analysing German nouns | |
Fung | Extracting key terms from Chinese and Japanese texts | |
Wechsler et al. | Multi-language text indexing for internet retrieval | |
Stolz et al. | When some dots turn a different color…: Thoughts on how (not) to determine whether or not reduplication is universal | |
Evert et al. | Identifying Morphosyntactic Preferences in Collocations. | |
Weller et al. | Extraction of German Multiword Expressions from Parsed Corpora Using Context Features. | |
Laporte | Reduction of lexical ambiguity | |
Borst et al. | Lexically-based distinction of readability levels of health documents | |
Dimitrov et al. | A lightweight approach to coreference resolution for named entities in text | |
Golcher | Statistical text segmentation with partial structure analysis | |
Nakov | Building an inflectional stemmer for Bulgarian. | |
Litkowski | Use of machine readable dictionaries for word-sense disambiguation in senseval-2 | |
Tufiş et al. | Lexical token alignment: Experiments, results and applications |