Word‐based text compression
Moffat, 1989
- Document ID
- 17694410984764712308
- Author
- Moffat A
- Publication year
- 1989
- Publication venue
- Software: Practice and Experience
Snippet
The development of efficient algorithms to support arithmetic coding has meant that powerful models of text can now be used for data compression. Here the implementation of models based on recognizing and recording words is considered. Move‐to‐the‐front and several …
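As a rough illustration of the two ideas the snippet names (parsing text into words and coding them with a move‐to‐the‐front list), here is a minimal sketch. It is not the paper's implementation: the function names `split_words`, `mtf_encode`, and `mtf_decode` are hypothetical, a single recency list is shared by words and separators for brevity, and the arithmetic coder that would turn the ranks into bits is omitted.

```python
import re


def split_words(text):
    """Split text into alternating alphanumeric 'words' and separator runs.

    Every character lands in exactly one token, so joining the tokens
    reproduces the input.
    """
    return re.findall(r"[A-Za-z0-9]+|[^A-Za-z0-9]+", text)


def mtf_encode(tokens):
    """Replace each token by its rank in a recency list (assumed scheme).

    A token seen before is emitted as its current position in the list;
    a novel token is emitted literally as ("new", token). Either way the
    token then moves to the front, so recently used tokens get small ranks.
    """
    recent = []
    codes = []
    for tok in tokens:
        if tok in recent:
            rank = recent.index(tok)
            recent.pop(rank)
            codes.append(rank)
        else:
            codes.append(("new", tok))
        recent.insert(0, tok)
    return codes


def mtf_decode(codes):
    """Invert mtf_encode by maintaining the same recency list."""
    recent = []
    tokens = []
    for code in codes:
        if isinstance(code, int):
            tok = recent.pop(code)
        else:
            tok = code[1]
        recent.insert(0, tok)
        tokens.append(tok)
    return tokens


text = "the cat and the dog and the cat"
codes = mtf_encode(split_words(text))
restored = "".join(mtf_decode(codes))
```

In a full compressor the small integer ranks and the literal spellings of new words would each be passed to an entropy coder such as an arithmetic coder; the sketch stops at the modelling step.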
Classifications
-
- H—ELECTRICITY
- H03—BASIC ELECTRONIC CIRCUITRY
- H03M—CODING; DECODING; CODE CONVERSION IN GENERAL
- H03M7/00—Conversion of a code where information is represented by a given sequence or number of digits to a code where the same information or similar information or a subset of information is represented by a different sequence or number of digits
- H03M7/30—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
- H03M7/40—Conversion to or from variable length codes, e.g. Shannon-Fano code, Huffman code, Morse code
- H03M7/42—Conversion to or from variable length codes, e.g. Shannon-Fano code, Huffman code, Morse code using table look-up for the coding or decoding process, e.g. using read-only memory
- H03M7/425—Conversion to or from variable length codes, e.g. Shannon-Fano code, Huffman code, Morse code using table look-up for the coding or decoding process, e.g. using read-only memory for the decoding process only
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30067—File systems; File servers
- G06F17/30129—Details of further file system functionalities
- G06F17/3015—Redundancy elimination performed by the file system
- G06F17/30153—Redundancy elimination performed by the file system using compression, e.g. sparse files
-
- H—ELECTRICITY
- H03—BASIC ELECTRONIC CIRCUITRY
- H03M—CODING; DECODING; CODE CONVERSION IN GENERAL
- H03M7/00—Conversion of a code where information is represented by a given sequence or number of digits to a code where the same information or similar information or a subset of information is represented by a different sequence or number of digits
- H03M7/30—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
- H03M7/3084—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction using adaptive string matching, e.g. the Lempel-Ziv method
- H03M7/3086—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction using adaptive string matching, e.g. the Lempel-Ziv method employing a sliding window, e.g. LZ77
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30943—Information retrieval; Database structures therefor; File system structures therefor details of database functions independent of the retrieved data type
- G06F17/30946—Information retrieval; Database structures therefor; File system structures therefor details of database functions independent of the retrieved data type indexing structures
-
- H—ELECTRICITY
- H03—BASIC ELECTRONIC CIRCUITRY
- H03M—CODING; DECODING; CODE CONVERSION IN GENERAL
- H03M7/00—Conversion of a code where information is represented by a given sequence or number of digits to a code where the same information or similar information or a subset of information is represented by a different sequence or number of digits
- H03M7/30—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
- H03M7/3084—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction using adaptive string matching, e.g. the Lempel-Ziv method
- H03M7/3088—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction using adaptive string matching, e.g. the Lempel-Ziv method employing the use of a dictionary, e.g. LZ78
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/21—Text processing
- G06F17/22—Manipulating or registering by use of codes, e.g. in sequence of text characters
Similar Documents
Publication | Title | Publication Date |
---|---|---|
Moffat | Word‐based text compression | |
CA2263453C (en) | A lempel-ziv data compression technique utilizing a dictionary pre-filled with frequent letter combinations, words and/or phrases | |
EP0650264B1 (en) | Byte aligned data compression | |
US5532694A (en) | Data compression apparatus and method using matching string searching and Huffman encoding | |
CN1183683C (en) | Position adaptive coding method using prefix prediction | |
JP3083730B2 (en) | System and method for compressing data information | |
Valmeekam et al. | LLMZip: Lossless text compression using large language models | |
WO1998006028A9 (en) | A lempel-ziv data compression technique utilizing a dictionary pre-filled with frequent letter combinations, words and/or phrases | |
CN106202172B (en) | Text compression methods and device | |
WO2011007956A2 (en) | Data compression method | |
CN117216023B (en) | Large-scale network data storage method and system | |
US7253752B2 (en) | Coding apparatus, decoding apparatus, coding method, decoding method and program | |
JPS6356726B2 (en) | ||
Lea | Text compression with an associative parallel processor | |
Cooper et al. | Text compression using variable‐to fixed‐length encodings | |
Niemi et al. | Burrows‐Wheeler post‐transformation with effective clustering and interpolative coding | |
Katajainen et al. | An approximation algorithm for space-optimal encoding of a text | |
Martínez-Prieto et al. | Natural language compression on edge-guided text preprocessing | |
Nevill-Manning et al. | Phrase hierarchy inference and compression in bounded space | |
CN112615627A (en) | Dynamic compression method and dynamic compression system based on improved run length coding | |
CN112506876A (en) | Lossless compression query method supporting SQL query | |
JP4152491B2 (en) | Data alignment device and compression device | |
Adiego et al. | Mapping words into codewords on PPM | |
AU2021100433A4 (en) | A process for reducing execution time for compression techniques | |
CN1186987A (en) | Information compression method and device thereof |