Papadopoulos et al., 2013 - Google Patents
The IMPACT dataset of historical document imagesPapadopoulos et al., 2013
View PDF- Document ID
- 12518836209554630911
- Author
- Papadopoulos C
- Pletschacher S
- Clausner C
- Antonacopoulos A
- Publication year
- Publication venue
- Proceedings of the 2Nd international workshop on historical document imaging and processing
External Links
Snippet
Representative and comprehensive datasets are a prerequisite for any research activity, from studying specific types of problems through training of algorithms to evaluating results of actual implementations. This paper describes an invaluable resource which is the result of …
- 230000000694 effects 0 abstract description 6
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/21—Text processing
- G06F17/22—Manipulating or registering by use of codes, e.g. in sequence of text characters
- G06F17/2288—Version control
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/21—Text processing
- G06F17/211—Formatting, i.e. changing of presentation of document
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30011—Document retrieval systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/21—Text processing
- G06F17/24—Editing, e.g. insert/delete
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30244—Information retrieval; Database structures therefor; File system structures therefor in image databases
- G06F17/30247—Information retrieval; Database structures therefor; File system structures therefor in image databases based on features automatically derived from the image data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30861—Retrieval from the Internet, e.g. browsers
- G06F17/30876—Retrieval from the Internet, e.g. browsers by using information identifiers, e.g. encoding URL in specific indicia, browsing history
- G06F17/30879—Retrieval from the Internet, e.g. browsers by using information identifiers, e.g. encoding URL in specific indicia, browsing history by using bar codes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F17/30634—Querying
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30286—Information retrieval; Database structures therefor; File system structures therefor in structured data stores
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30017—Multimedia data retrieval; Retrieval of more than one type of audiovisual media
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30067—File systems; File servers
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30908—Information retrieval; Database structures therefor; File system structures therefor of semistructured data, the undelying structure being taken into account, e.g. mark-up language structure data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00442—Document analysis and understanding; Document recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06Q—DATA PROCESSING SYSTEMS OR METHODS, SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/10—Office automation, e.g. computer aided management of electronic mail or groupware; Time management, e.g. calendars, reminders, meetings or time accounting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/20—Image acquisition
- G06K9/2054—Selective acquisition/locating/processing of specific regions, e.g. highlighted text, fiducial marks, predetermined fields, document type identification
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Papadopoulos et al. | The IMPACT dataset of historical document images | |
Antonacopoulos et al. | A realistic dataset for performance evaluation of document layout analysis | |
Neudecker et al. | OCR-D: An end-to-end open source OCR framework for historical printed documents | |
Clausner et al. | The ENP image and ground truth dataset of historical newspapers | |
Brunessaux et al. | The maurdor project: Improving automatic processing of digital documents | |
Terras | Inviting AI into the archives: the reception of handwritten recognition technology into historical manuscript transcription | |
Klijn et al. | The current state-of-art in newspaper digitization | |
Vidal-Gorène et al. | A modular and automated annotation platform for handwritings: evaluation on under-resourced languages | |
Choudhury et al. | A heuristic baseline method for metadata extraction from scanned electronic theses and dissertations | |
Boenig et al. | Labelling OCR Ground Truth for Usage in Repositories | |
Antonacopoulos et al. | The lifecycle of a digital historical document: structure and content | |
Ramel et al. | Interactive layout analysis, content extraction, and transcription of historical printed books using Pattern Redundancy Analysis | |
Hebert et al. | PIVAJ: displaying and augmenting digitized newspapers on the web experimental feedback from the" Journal de Rouen" collection | |
Olson et al. | Digitization decisions: comparing OCR software for librarian and archivist use | |
Bień | The IMPACT project Polish Ground-Truth texts as a DjVu corpus | |
WO2001013279A2 (en) | Word searchable database from high volume scanning of newspaper data | |
Baird et al. | Document analysis systems for digital libraries: Challenges and opportunities | |
Dulla | A dataset of warped historical arabic documents | |
Bartošek et al. | DML-CZ: the objectives and the first steps | |
Budig et al. | Glyph miner: a system for efficiently extracting glyphs from early prints in the context of OCR | |
Sojka | Digitization Workflow in the Czech Digital Mathematics Library | |
Yacoub et al. | Document digitization lifecycle for complex magazine collection | |
Rowberry | Digitizing the USPTO patent backfile | |
Matusiak et al. | A newspaper/periodical digitization project in Mongolia: Creating a digital archive of rare Mongolian publications | |
Tranouez et al. | PIVAJ: an article-centered platform for digitized newspapers |