[go: up one dir, main page]

Moffat, 1989 - Google Patents

Word‐based text compression

Moffat, 1989

View PDF
Document ID
17694410984764712308
Author
Moffat A
Publication year
Publication venue
Software: Practice and Experience

External Links

Snippet

The development of efficient algorithms to support arithmetic coding has meant that powerful models of text can now be used for data compression. Here the implementation of models based on recognizing and recording words is considered. Move‐to‐the‐front and several …
Continue reading at www.researchgate.net (PDF) (other versions)

Classifications

    • HELECTRICITY
    • H03BASIC ELECTRONIC CIRCUITRY
    • H03MCODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M7/00Conversion of a code where information is represented by a given sequence or number of digits to a code where the same information or similar information or a subset of information is represented by a different sequence or number of digits
    • H03M7/30Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
    • H03M7/40Conversion to or from variable length codes, e.g. Shannon-Fano code, Huffman code, Morse code
    • H03M7/42Conversion to or from variable length codes, e.g. Shannon-Fano code, Huffman code, Morse code using table look-up for the coding or decoding process, e.g. using read-only memory
    • H03M7/425Conversion to or from variable length codes, e.g. Shannon-Fano code, Huffman code, Morse code using table look-up for the coding or decoding process, e.g. using read-only memory for the decoding process only
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30067File systems; File servers
    • G06F17/30129Details of further file system functionalities
    • G06F17/3015Redundancy elimination performed by the file system
    • G06F17/30153Redundancy elimination performed by the file system using compression, e.g. sparse files
    • HELECTRICITY
    • H03BASIC ELECTRONIC CIRCUITRY
    • H03MCODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M7/00Conversion of a code where information is represented by a given sequence or number of digits to a code where the same information or similar information or a subset of information is represented by a different sequence or number of digits
    • H03M7/30Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
    • H03M7/3084Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction using adaptive string matching, e.g. the Lempel-Ziv method
    • H03M7/3086Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction using adaptive string matching, e.g. the Lempel-Ziv method employing a sliding window, e.g. LZ77
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30943Information retrieval; Database structures therefor; File system structures therefor details of database functions independent of the retrieved data type
    • G06F17/30946Information retrieval; Database structures therefor; File system structures therefor details of database functions independent of the retrieved data type indexing structures
    • HELECTRICITY
    • H03BASIC ELECTRONIC CIRCUITRY
    • H03MCODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M7/00Conversion of a code where information is represented by a given sequence or number of digits to a code where the same information or similar information or a subset of information is represented by a different sequence or number of digits
    • H03M7/30Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
    • H03M7/3084Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction using adaptive string matching, e.g. the Lempel-Ziv method
    • H03M7/3088Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction using adaptive string matching, e.g. the Lempel-Ziv method employing the use of a dictionary, e.g. LZ78
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20Handling natural language data
    • G06F17/21Text processing
    • G06F17/22Manipulating or registering by use of codes, e.g. in sequence of text characters

Similar Documents

Publication Publication Date Title
Moffat Word‐based text compression
CA2263453C (en) A lempel-ziv data compression technique utilizing a dictionary pre-filled with frequent letter combinations, words and/or phrases
EP0650264B1 (en) Byte aligned data compression
US5532694A (en) Data compression apparatus and method using matching string searching and Huffman encoding
CN1183683C (en) Position adaptive coding method using prefix prediction
JP3083730B2 (en) System and method for compressing data information
Valmeekam et al. Llmzip: Lossless text compression using large language models
WO1998006028A9 (en) A lempel-ziv data compression technique utilizing a dicionary pre-filled with fequent letter combinations, words and/or phrases
CN106202172B (en) Text compression methods and device
WO2011007956A2 (en) Data compression method
CN117216023B (en) Large-scale network data storage method and system
US7253752B2 (en) Coding apparatus, decoding apparatus, coding method, decoding method and program
JPS6356726B2 (en)
Lea Text compression with an associative parallel processor
Cooper et al. Text compression using variable‐to fixed‐length encodings
Niemi et al. Burrows‐Wheeler post‐transformation with effective clustering and interpolative coding
Katajainen et al. An approximation algorithm for space-optimal encoding of a text
Martínez-Prieto et al. Natural language compression on edge-guided text preprocessing
Nevill-Manning et al. Phrase hierarchy inference and compression in bounded space
CN112615627A (en) Dynamic compression method and dynamic compression system based on improved run length coding
CN112506876A (en) Lossless compression query method supporting SQL query
JP4152491B2 (en) Data alignment device and compression device
Adiego et al. Mapping words into codewords on PPM
AU2021100433A4 (en) A process for reducing execution time for compression techniques
CN1186987A (en) Information compression method and device thereof