Papadopoulos et al., 2013 - Google Patents

The IMPACT dataset of historical document images

Document ID
12518836209554630911
Author
Papadopoulos C
Pletschacher S
Clausner C
Antonacopoulos A
Publication year
2013
Publication venue
Proceedings of the 2nd International Workshop on Historical Document Imaging and Processing

Snippet

Representative and comprehensive datasets are a prerequisite for any research activity, from studying specific types of problems through training of algorithms to evaluating results of actual implementations. This paper describes an invaluable resource which is the result of …
Continue reading at www.primaresearch.org (PDF) (other versions)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/21 Text processing
    • G06F17/22 Manipulating or registering by use of codes, e.g. in sequence of text characters
    • G06F17/2288 Version control
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/21 Text processing
    • G06F17/211 Formatting, i.e. changing of presentation of document
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30011 Document retrieval systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/21 Text processing
    • G06F17/24 Editing, e.g. insert/delete
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30244 Information retrieval; Database structures therefor; File system structures therefor in image databases
    • G06F17/30247 Information retrieval; Database structures therefor; File system structures therefor in image databases based on features automatically derived from the image data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30861 Retrieval from the Internet, e.g. browsers
    • G06F17/30876 Retrieval from the Internet, e.g. browsers by using information identifiers, e.g. encoding URL in specific indicia, browsing history
    • G06F17/30879 Retrieval from the Internet, e.g. browsers by using information identifiers, e.g. encoding URL in specific indicia, browsing history by using bar codes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/3061 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F17/30634 Querying
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30286 Information retrieval; Database structures therefor; File system structures therefor in structured data stores
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30017 Multimedia data retrieval; Retrieval of more than one type of audiovisual media
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30067 File systems; File servers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30908 Information retrieval; Database structures therefor; File system structures therefor of semistructured data, the underlying structure being taken into account, e.g. mark-up language structure data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/00442 Document analysis and understanding; Document recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06Q DATA PROCESSING SYSTEMS OR METHODS, SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/10 Office automation, e.g. computer aided management of electronic mail or groupware; Time management, e.g. calendars, reminders, meetings or time accounting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/20 Image acquisition
    • G06K9/2054 Selective acquisition/locating/processing of specific regions, e.g. highlighted text, fiducial marks, predetermined fields, document type identification

Similar Documents

Publication Publication Date Title
Papadopoulos et al. The IMPACT dataset of historical document images
Antonacopoulos et al. A realistic dataset for performance evaluation of document layout analysis
Neudecker et al. OCR-D: An end-to-end open source OCR framework for historical printed documents
Clausner et al. The ENP image and ground truth dataset of historical newspapers
Brunessaux et al. The maurdor project: Improving automatic processing of digital documents
Terras Inviting AI into the archives: the reception of handwritten recognition technology into historical manuscript transcription
Klijn et al. The current state-of-art in newspaper digitization
Vidal-Gorène et al. A modular and automated annotation platform for handwritings: evaluation on under-resourced languages
Choudhury et al. A heuristic baseline method for metadata extraction from scanned electronic theses and dissertations
Boenig et al. Labelling OCR Ground Truth for Usage in Repositories
Antonacopoulos et al. The lifecycle of a digital historical document: structure and content
Ramel et al. Interactive layout analysis, content extraction, and transcription of historical printed books using Pattern Redundancy Analysis
Hebert et al. PIVAJ: displaying and augmenting digitized newspapers on the web experimental feedback from the "Journal de Rouen" collection
Olson et al. Digitization decisions: comparing OCR software for librarian and archivist use
Bień The IMPACT project Polish Ground-Truth texts as a DjVu corpus
WO2001013279A2 (en) Word searchable database from high volume scanning of newspaper data
Baird et al. Document analysis systems for digital libraries: Challenges and opportunities
Dulla A dataset of warped historical arabic documents
Bartošek et al. DML-CZ: the objectives and the first steps
Budig et al. Glyph miner: a system for efficiently extracting glyphs from early prints in the context of OCR
Sojka Digitization Workflow in the Czech Digital Mathematics Library
Yacoub et al. Document digitization lifecycle for complex magazine collection
Rowberry Digitizing the USPTO patent backfile
Matusiak et al. A newspaper/periodical digitization project in Mongolia: Creating a digital archive of rare Mongolian publications
Tranouez et al. PIVAJ: an article-centered platform for digitized newspapers