CN1082784A

CN1082784A - Utilize the data compression system of source representation

Info

Publication number: CN1082784A
Application number: CN93109259.0A
Authority: CN
Inventors: 小·L·E·莱特; E·G·基米
Original assignee: REDBAND TECHNOLOGIES Inc
Current assignee: REDBAND TECHNOLOGIES Inc
Priority date: 1992-08-03
Filing date: 1993-08-03
Publication date: 1994-02-23
Also published as: MX9304659A; AU4793093A; IL106335A0; TW256970B; CA2141669A1; EP0653125A1; WO1994003984A1; JPH07509824A

Abstract

The present invention relates to a kind of method and apparatus of data compression, this method and apparatus comprises the processor of one or more maximal sequences that is used for selecting to represent at the receiver and being used to that input receives a serial data of the information that representative will send at least a portion of this serial data.A transmitter is provided, is used for sending the one or more maximal sequences of output, so that form deal with data to compress.

Description

Utilize the data compression system of source representation

The application is that common pending application number is No.300037, the part continuation application of existing U.S. Patent No. 5136618 of authorizing on August 4th, 1992.

The present invention relates to a kind of system that is used to compress or transmit numeral or analogue data sequence with high compactness.

The known data of people transmit, such as phone and electric wire, and wireless and microwave link, optical fiber and similar circuit all have the limited capacity that transmits data.Thereby correspondingly limited the ability of the transmitter and the receiver of deal with data.As a general example, many offices all have facsimile machine at present, this facsimile machine can be sent to the reception machine that is positioned at the another one place with numerical data (the bright and dark zone by the file that will send is transformed into sequence of data bits, and promptly many 1 and 0 sequence obtains) by telephone line.Since the metal telephone wire known physically with electric on all restrictions, data can not be transmitted faster than a certain speed.

In addition, be contained in the numerical data that the modulator-demodulator reality in the facsimile machine delivers to numerical data on the telephone line and also on its transmission rate, maximum limit arranged.No matter facsimile machine can be a digital form with speed conversion how soon with a file, if its inner modulator-demodulator can not be being higher than, such as the speed of 2400 bits per seconds sends, and then message is to send and to receive with a kind of fast speeds.At last, facsimile machine can on which speed scanning document, also just having limited message can send on which speed.

A kind of method of improving transfer rate is the machine that manufacturing itself has two-forty.If, facsimile machine can be enough fast the velocity scanning file, if its modulator-demodulator can be with per second 4800 bits, rather than 2400 bit transmit, can be if the transmission coal is situated between with this rate transmissioning data, if also can receive with this speed with receiving system, then transmission speed just can double.

The other method of improving transmission speed is to utilize a kind of wider bandwidth that has, and the transmission coal of promptly a kind of higher transmission data capability or capacity is situated between.For example the form of the light pulse of optical fiber transmit data and for this reason optical fiber compare with traditional copper cash with electrical signal form transmission data, have very big data capacity.

A kind of method that also has in addition that increases transmission speed is the data that compression will transmit.Still with facsimile machine as example, suppose that someone wants to transmit envelope letter, and this letter only has ten capable typewritten texts in the first half of one page standard blank sheet of paper.In common facsimile machine, this part file scans with a large amount of lines, and every line comprises a large amount of adjacent point or image cells (pixel).For the purpose of this example, suppose that scanner scans this document with continuous 1000 row and every line is made up of 1000 pixels; In other words, the scanner of supposing this facsimile machine is decomposed into 1,000,000 points of 1000 * 1000=1 with a page file.

In scan period, each pixel gives assignment as follows as an example, if promptly pixel is almost deceived, then assignment is " 1 "; If almost be white, then assignment is " 0 ".(more advanced facsimile machine utilizes different schemes and many facsimile machines even represents each pixel with one of several gray tones.) in this example, so this page can be represented by a sequence of 1,000,000 bits many 1 and 0.Yet for the actual texts that ten row are only arranged, considerably less point is deceived and be we can say that institute's data sequence that passes of at least 99% will be made up of zero; Specifically, the latter half of this page does not contain text fully, but still needs complete zero of 500,000 bits to represent it.

A kind of in this example transmission method will be to transmit their 1,000,000 all bits.From another point of view, this will lose time and the memory space of data, because the major part of transmission will contain garbage (some bits of for example representing text are useful information).The shared time of bit of 500,000 second pages of transmission has been wasted, and transmits because work as these bits, and the transmission coal is situated between, and the message that can not be used further to other such as telephone line transmits.

People can have certain structure by the data sequence that observation will transmit and reduce useless transmission.In this example, people can notice most " 1 " and particularly " 0 ", and the form with long string occurs often, rather than at random.In addition, the latter half of this page file is by single a string 500,000 zero representatives.

According to the known data compression scheme that is used for facsimile machine of people, this feature structure of the many files of being faxed of this scheme identification, people do not transmit " 1 " and " 0 " of the reality of representing this document, nor transmit the length of current Bit String.For example, replace 10 " 1 ", and then 10 " 0 ", and then 5 " 1 ", the and then transmission of the sequence of 500,000 " 0 ", people can transmit

numeral

10,10,5 and 500,000 simply.By the machine 1 of telling reception be with " black " string be beginning, change in color with each new digitized representation string, people can only transmit four numerals (these four numerals can only be come digitized representations by tens bits) and replace 500 like this, 000+10+10+5=500, the string of 025 bit.

After the receiver of facsimile machine had received this bit string (10,10,5,500,000), this receiver was printed 10 stains, then went to and printed 10 white points, then went to and printed 5 stains, went at last again and printed 500,000 white points.By taking the advantage of legacy data string structure, so people can include identical information in very short bit string.And then, this scheme allow people with less expensive and efficiently mode send identical message, and lower transmission coal Jie's of tolerance capacity.

Receiver can receive and store this bit string (10,10,5,500,000) very fast, even in fact receiver will print the file of being faxed with long many time.Yet the shortcoming of this data compression scheme is that the high compression that this scheme realizes is only worked as the unconverted binary digit that the data that transmitted contain many long strings.If being complete black (1,000,000 binary ones) or complete white (1,000,000 binary zeroes), the page or leaf that this transmitted to realize the highest degree of compression, because this people only must transmit a numeral (1,000,000), just fully reappears the full text of this document.

On the other hand, if original document comprises the figure of the criss-cross arrangement up and down of black and white pixel alternate, then this scheme does not realize compression fully, in this case, this system must send 1, the bit string of 000,000 " 1 " (each string only behind long and each pixel of pixel all change in color once).Can not simplify binary bits for this reason, but the data block of 1,000,000 bit (each length is enough to represent the longest number that may transmit 1000,000), in fact this scheme goes to transmit the data flow of being somebody's turn to do " compressing " with the very long time.In order to make system effective, this compressibility requires the data of input itself to have certain structure.

One type of the system that the example conduct of given facsimile machine is compressed is useful, but may can both not realize desired result in all cases.The similar problem of data compression and efficient also can be run in many other technologies field, for example, and the speech transmissions of digital form, Digital Television and information some other fields that will be transmitted therein with a kind of form of Serial No..Because the overwhelming majority of message transmission, such as realizing (by transmitting binary number) by modern load mode (satellite, optical fiber or the like) with digital form, the purpose that increases capacity by data compression all can find aspect many in the Modern Telecommunication technology.

Therefore, have a kind of needs of data compression system, this system provides a higher ability for transmission or deal with data, and needs a kind of system, and this system provides the ability of deal with data quickly, thereby causes more data or information to be passed through.The needs that also have a kind of system and method, this system and method can provide the data compression ability higher than former technology.

Thereby, an object of the present invention is to provide a kind of data communications system that is used for, this system reaches the higher data compression degree in the cards than present system.

A further object of the invention provides a kind of method and apparatus that is used for deal with data or other information, this method and apparatus has the ability of the higher deal with data that can realize such as preceding technology, and faster than the speed of former method and apparatus deal with data.

Another object of the present invention provides a kind of method and apparatus, and this method and apparatus can allow higher data or throughput for various data and messaging device.

A further object of the invention provides a kind of method and apparatus, and this method and apparatus is used to increase the data and the data capacity and the speed that can be sent to another computer from a computer by Computer Processing.

According to the present invention, a kind of data compression system and method are provided, and this system and method allows the processing and the transmission of data to have higher capacity, data processing quickly, provide higher data compression with therefore higher throughput and this system and method, and do not lose completely.Data compression system is provided with a device, is used for being received in the serial data that input is represented the information that will be transmitted.Has the data sequence that some system equations of being used for selecting to represent the device of one or more label label of a part of this serial data at least, this choice device to comprise and having corresponding each label and this system equation produce numeral.Also has the device that is used to transmit these labels.

In another form of the present invention, data compression system comprises the device that is used for one or more equations, represents one or more families of orthogonal basis to represent at least a portion of this serial data, and wherein each equation has a corresponding unique label.Also have and be used for and be sent to the device of output corresponding to one or more labels of selected each equation.

In other preferred form of the present invention, the device that is used to select comprises the device that is used to select one or more maximal sequences.For example, provide a data compressibility, this system comprises the device of the serial data that is used to receive the information that representative will transmit.Also have and be used to select to represent the device of one or more maximal sequences of the part of this serial data at least.Also has the device that is used for the expression formula of one or more maximal sequences is sent to an output.

A kind of one group of method that will be used for the algorithm of data compression system that is used to constitute has also been described, such as the form that can be a plurality of finite state machines here.This method comprises the step of determining a set of equations, and wherein the separate equation in this set of equations is orthogonal, with the orthogonal basis that forms at least a portion that will be used to represent a data sequence and keep this group orthogonal basis.This method also is included as the step that each base distributes a unique label, and wherein each label list is shown in formed corresponding each base in the step of determining separate equation.For example in a preferred embodiment, the step of determining a set of equations comprises the step of determining at least one maximal sequence.The step that should determine at least one maximal sequence can comprise the N-1 cycle shift of the maximal sequence of the maximal sequence of determining at least one length N and length N.

In other preferred form of the present invention, an encode processor will be divided into the data block of a N binary bits by the data sequence that the source produces, and this data block then is transformed to the data block of N numeral again.This N numeral is equivalent to the value of counting of this binary bits or number.Value N preferably is chosen as the Mersenne prime number of suitable size, thereby can compress effectively, and provides acceptable to calculate fast.Encoder then calculates the maximal sequence of length N for each data block.This maximal sequence adds its N-1 cycle shift, the base of a N vector.What " the 0th " individual bases are this system can also provide by will be shifted the number " i " that obtains different base vector regulations.

Then N vector or base for example can be stored in one or more processors, such as sending computer and receiving computer or be fit to store in the computer of packed data, and distribute unique label for use in the transmission data.Under the situation of transmission and receiving computer, conversion input data find maximal sequence, select one or more basic N-1 cycle shift to represent the part of these data.The label that then will be used for maximal sequence is sent to receiving computer with relevant cycle shift, can not have this identical data block of reproduction of losing with this receiving computer.In addition on the one hand, system sends the label and basic sign (the displacement degree of " the 0th " base) that is used for maximal sequence.Therefore, by determining a suitable maximal sequence, system can increase the capacity by the system handles data in this example, thereby has increased speed and the throughput that has increased data or other information by the system handles data.This system provides quite high amount of data compression.This method and system can be used in the very wide range of application that comprises the facsimile machine of above having discussed.

People will notice, an aspect of facsimile machine example, utilize described data compression scheme, people no longer transmit the serial data of the reality of the original message of representative (digitized file), make reception machine itself how can rebuild the plans (blueprint) of former message but transmit.People transmit the description of this original data flow feature, rather than this data flow itself.

According to the present invention, replace the data sequence that transmits a kind of shortening (compression) mode, constitute " element type data block " (building blocks) or " data generator " (data generators) of serial data according to system of the present invention, this system can be combined to form the pattern of this data source, when this data source is activated, produces people and wish the identical data sequence that transmits.Replace to transmit data, people can transmit a kind of information, and this information tells receiver to make up its data generator as this a kind of mode how, makes this receiver oneself produce this data sequence again." data generator " can be described as being referred to as " finite state machine " with mathematical way, it can utilize one or more processors to realize.The various characteristics of these finite state machines is predefined and is stored in transmitter and the receiver.System and this list entries of post analysis are determined the sequence that constituted, and finite state machine will produce the list entries of these data.Another situation, the various characteristics of this finite state machine produces in the time of can working as system initialization.Therefore, code or the instruction that is used to produce each characteristic of this finite state machine can be sent to receiver.Then transmitter and receiver similarly by configuration and then transmitter can send some and discern which finite state machine, the identification signal of finite state machine combination is input to reconstruction the data sequence in the transmitter at last.

Substitute the feature that sends sequence itself or relevant sequence, replace the identification signal that sends finite state itself or its combination, the list entries of just can regenerating according to transmitter of the present invention; This transmission requires the less data bit.Then receiver can start this finite state machine, makes it operation, removes the playback of data list entries.Therefore, the present invention has developed the model or the expression in initial data source, so this model does not have the desired data sequence of regeneration with losing in receiver.

Fig. 1 has schematically showing of two data processing units, and they can be used for data compression of the present invention.

Fig. 2 is in conjunction with the signal of the transmitter of various aspects of the present invention and receiver and block diagram and has described according to method of the present invention.

Fig. 3 A and 3B describe according to the inventive method to analyze respectively, encode, comprehensively and the flow chart of decoding entire method.

According to the present invention, the data compression system of describing on the top of Fig. 2 can be realized by multiple should being used for.Transmission that Fig. 1 describes and process computer 2 contain the data compression system according to reception of the present invention and packed data, with the maximal sequence of the one or more representative serial data at least a portion selected by data compression system of output to modulator-demodulator 3, transmit these maximal sequences to the receiving modem 4 that is linked with receiving computer 5, contain the promising comprehensive and decode system that from each maximal sequence of receiving by sending computer 2 and other information, receives each maximal sequence and integrated data sequence in the receiving computer 5.

The described system of Fig. 1 is the expression of a kind of equipment of simplifying, and this system adopts the present invention valuably, realizes easier and deal with data and information quickly.The present invention includes facsimile machine and use, transmission of digital form and speech, Digital Television or record and similar other application are that the imagination is come out easily.

Before deeply inquiring into practical structures of the present invention and operating, the notion of understanding source representation earlier is beneficial.Fig. 2 represents a kind of very big simplification back notion of telecommunication system.In Fig. 2, transmitter 10 is attempted to send data to receiver 14 by transmission coal Jie 12.Coal Jie 12 can be electric wire, electromagnetic radiation, optical fiber or the like.

Transmitter comprises, perhaps more generally speaking, is to be connected to data source 16, and this data source produces the output signal that can be transformed to digital form.Data source for example may be: document scanner, speech digitizer, contain people and wish mechanical, electrical gamma camera, remote measurement or satellite scanning device and any countless potential modern comfort that can produce digital form information looked of the data computing that transmits.This numerical data (numerical data) for example can be the Serial No. from initial numberical data (primary digital data), or the discrete form of analogue data.In illustrated example, source 16 produces 14 binary digital numeric strings.Transmitter also comprises an encoder 17, and its function is described in the back.

Receiver 14 comprise controller or processor 18 and N digital network or finite state machine (label be FSM1, FSM2 ... FSMN).Each finite state machine with binary digital Finite Number word string form produce unique output signal X1, an X2 ... XN.When for example FSM1 is activated, produce 1100101 sequences.The digital electronics field that is formed in of finite state machine is known, for example at publication Digital Networks and Computer Systems, Taylor L.Booth, John Wiley and Sons, Inc, 1971 have discussed.What this piece argumentation was interesting is the binary digital unique numeric string of fixed number that the FSM1-FSMN parts produce.

Controller 18 is connected to each finite state machine FSM1-FSMN and receives the serial data that is produced from each state machine through incoming line I1-IN through output line A1-AN.Controller 18 can be selected any or all of state machine and their output signal of combination.In this, people should remember binary system, logical operator with (AND) and or (OR).Provide two binary string X and Y, people form X and Y by their independent numeral to multiplying each other.Therefore, 101 and (AND) 001 is 001; In other words, if all be 1 only, be only 1 in each numeral of result in the respective digital of the binary string of two inputs.For forming X or (OR) for the Y, only when two binary strings of input on the corresponding position the two one of them or the two all be that formed binary string as a result just all is 1 under 1 the situation.Therefore, 101 or (or) 011 is 111.

Fig. 2 example in, source 16 has produced a binary digital output string of 14 of will be transmitted.As mentioned above, people may simplify these 14 bits of transmission, but the compression of may being unrealized.We also may attempt such as certain encoding scheme that transmits binary string length, but in this case, because people can't suppose the length of no change string, even a kind of like this scheme can not increase efficiency of transmission.

On the other hand, suppose to contain in the encoder 17 information of the output string that produces relevant for state machine FSM1-FSMN.Encoder at first is divided into source output string each data block B with 7 bits (identical with the length of the output string of state machine) ₁And B ₂Then encoder can be analyzed data block B ₁In fact can pass through to form X with definite this data block ₁With (AND) X ₂Obtain, i.e. (1100101 AND 1000110)=1000100=B1.Equally, encoder can be determined B2=(X2 OR XN).

Replace to send whole 14 Bit Strings, so transmitter can send the signal of the weak point of which or which state machine that identification selected by controller 18 simply, which operation same short signal indication will carry out in their output signal.Then, after carrying out appropriate combination, controller 18 enable state machine operations, so as in receiver the renewable source output sequence.In essence, some state machines by suitably combination, have played the effect in source itself, promptly in receiver the source simulated and all transmission all be the required information of this model of composition.

Finite state machine as shown in Figure 2 only is the expression that can be used in the structured sort of representing a data source.In some applications, having a single state machine may be just enough.This state machine depends on that controller gives its comfortable initial value, and produces different output strings.In other type, each has the dateout sequence of one group of their generation each state machine, but each sequence can produce with different periodicity order; For example, the output of a state machine is ABC, by to each alphabetical right shift one step and cyclic shift (Wrapping around), may produce CAB and BCA, and a letter in a sequence becomes first letter of next sequence like this.(utilize letter just for cause clearly, the string of binary number can cyclic shift with the same manner).

Utilize example shown in Figure 2, it is not always tangible that output by assembled state machine FSM1-FSMN or other operations will have the model that can mould all possible source 16, will produce the finite state machine of the limited output of desired limited output though almost always might construct.For example, suppose only have the state machine that produces X1=100 and X2=010.Do not have " with " or the simple combination of inclusive-OR operation can be created in of the output of the 3rd position for " 1 "; Different state machine in addition will be required, dissimilar output, or different possible computing.

For example, if produce the state machine of X3=001 in addition,, may rebuild whole eight possibility forms of 3 bit words then by the combination of X1, X2 and X3.In other words, X1, X2 and X3 form the complete base (Complete basis) that whole 3 Bit data sequences are rebuild.Fully aware of, export and operate by which output and handle the classification of determining which kind of source by selection mode machine and its, people can simulate and be transmitted as each required analog parameter of simulation.

Only utilize 7 bit data block B1 and B2, the degree of compression that can be realized by such transmission system is quite little, therefore in fact can get greater than 7 bits and come dummy source.On the other hand, people can represent the degree of compression theoretically, and people also can realize the increase greatly of the degree of compression by the size that strengthens data block; Key is to select one group of dateout (base) from some state machines or their equivalent (simplification of each output sequence that is supposed to such as them, the table of storage), organize the computing that is fit to performed on the data with selecting at this, people just can simulate a source desired or that give the phase classification effectively like this.So exactly key provided by the present invention, it determines a complete base that is much higher than 3 rank, can rebuild the source of the very long data sequence of generation according to system of the present invention like this and therefore can realize the data compression of height.

Maximal sequence as complete base

According to the present invention, binary digital data sequence at first partly is transformed to the equivalent sequence of " real number ", so that (for example in the mode that can understand usually, be not carry out various " with " and " or " computing, common take advantage of and add and can utilize) each numeral is carried out common arithmetic operator.A kind of mode of carrying out this computing is that each binary one is distributed to-1 value and each binary zero is distributed to+1 value, so binary sequence (1,0,0,1) is transformed to the sequence that counts (1,1,1 ,-1).This distribution can by known " effectively " (aval) function represent, should " effectively " mean " value of counting ", therefore

Aval(X)=+ the 1(real number) when the X=0(binary number) and

=-1(real number) when the X=1(binary number)

Then carry out according to the arithmetic operator of whole computing utilizations of the present invention to the routine of (aval) equivalence value that counts of data sequence.

Also the data sequence is calculated a base fully according to system of the present invention, the institute that this complete base will all be simulated in a predetermine class is active.The base of this calculating is a set that is called maximal sequence.The theory of maximal sequence derives in one piece of paper, provides simple description below.

Normalization

Turn back to 3 bit digital binary systems, the system with base vector (1,0,0), (0,1,0) and (1,1,0) will be the system of a poor efficiency, because this system may obtain the 3rd vector by getting preceding two vectors with the OR combination.What does not add the 3rd vector, the ability of rebuilding other each vectors is just arranged, because it is similar to other two vectors.In other words, the 3rd vector has the correlation too big with other each vectors.Theoretically, all these vectors or and other each vector quadratures and or and the relation of the combination of other each vectors between all be incoherent.

Provide the data sequence of a length N, have several method to produce one group of N orthogonal basis vector, utilize these vectors people to rebuild source sequence by suitable computing.A kind of usual way that is used to produce these vectors is known from the Gram-Schmidt method.Regrettably, in order to produce one group of N base vector, this method requires about N ²Inferior calculating.Therefore, this method will be carried out the calculating of 1000,000 sub-quantities, and one group of 1000 nonopiate base vector is transformed to the quadrature form.For Fast Compression, the burden of this calculating usually is too heavy, and seems undesirable slow as a rule.The present invention taked one preferably solution be start one group known be quadrature or be the vector of nearly orthogonal at least.

According to the present invention, the maximal sequence of selection length L and its N-1 cycle shift are as basic sequence.Utilizing maximal sequence is that maximal sequence structurally is very complicated as the shortcoming of basic sequence, unless the length of this sequence is the prime number that is referred to as Mersenne, is some prime numbers for length N promptly and has 2 ^p-1 form, wherein p itself is a prime number.(respectively for p=3,5,7 and 13, length 7,35,127 and 8191 is examples of whole Mersenne prime number length)

When N was a Mersenne prime number, a single maximal sequence can expressing length N and its N-1 cycle shift was formed the universal sequence of the full sequence that is used for real number N.In other words, if provide a real number N sequence, people can also constitute a maximal sequence, can be used in any sequence of rebuilding real number N together with the N-1 cycle shift sequence of this maximal sequence (as the ABC that provides above, CAB example, to the right or a step and the cyclic shift of shifting left).

This structure still is highly effective, and people can prove the general base that utilizes the binary number that is not more than p might be configured for the data sequence of practical N, wherein p≤1+log theoretically ₂(M ^(N)+ 1) and M ^(N)It is the minimum Mersenne prime number that is not less than N.If the sequence of a real number is for example arranged, then only need 1+log ₂(31+1)=binary number of 1+5=6 is used to finish reconstruction.In order to rebuild any sequence of N=8191 real number, will only need 13 binary numbers.

Therefore, according to method of the present invention, be that an encode processor will be divided into some data blocks of N binary bits by the data sequence that a source produces, then these data blocks are transformed to the data block (corresponding to the value of counting of binary bits or numeral) of N numeral again.The N value is chosen as being a suitably Mersnne prime number of size, so that can effectively compress, but also the suitable quick calculating that can receive.The big young pathbreaker of N is depended on application and is depended on the knowledge of any source sequence structure.

Then, encoder is the maximal sequence (utilize predetermined with well-defined equation) of each data block computational length N.This maximal sequence adds that its N-1 cycle shift forms the base of a N vector.Yet it should be noted does not need to send whole N vectors; It is just enough to send original maximum vector (0 displacement).Because every other vector has same section, just carried out the displacement of a location number, in case receiver has had the 0th base, then only need to transmit a digital i, stipulate that the 0th base will what claim the position just can obtain a different base vector.

When each continuous source block during sufficiently similar in appearance to that data block that maximal sequence is calculated, with the value that is sent to receiver is that basic each index (degree of the 0th base displacement) and the decoding processor in receiver then can be rebuild this source block.In fact, receiver produces the model or the expression in source, and then receiver is activated the data sequence of reconstructions " really " source generation again.If the difference of the data block of continuous data block and front too big (determining by known formula), then encode processor recomputates a new maximal sequence and encodes and carry out compression step again.

According to the present invention, the Code And Decode processor is also carried out some to various data sequences, and other calculate, so that these data conversions are become the form of easier Operations Analyst.Yet, be each data block of Mersenne prime number length with the source data piecemeal, calculate maximum and for the continuation in source is rebuild, only transmit maximum displacement index.According to method of the present invention, compare with the degree of compression that existing compressibility can provide, the bigger data compression degree for the long data sequence is provided.

After above to general description of the present invention, will do more detailed discussion to the present invention below.

Operating principle

General operating principle of the present invention is as described below:

A binary data source (output sequence) is analyzed, determine which finite state machine, the all definite operations of algorithm (being sometimes referred to as the generator algorithm) form (being sometimes referred to as generator classification algorithm) of the label set of which kind of finite state machine produce binary data sequence, will produce list entries.The label of this selected finite state machine is assigned to this list entries and is this list entries distribution source-representation.By select its label from each generator classification algorithm is the algorithm of source-representation of list entries and the operation of this algorithm of initialization, and list entries is reverted to output sequence.

The utilization of source-representation data communication in transfer of data

If utilize the operating unit of source-representation data communication as data transmission system, wherein the source-representation of list entries is that it is represented as a binary sequence (the source-representation sign indicating number of list entries), if with this source-representation sign indicating number be the actual sequence that is transmitted, recover this list entries according to the description of above source-representation data communication as output sequence with the receiving element of system, then:

Wherein the source-representation sign indicating number of the list entries of this list entries has identical bit timing, the effect of source-representation data communication is, this list entries is transmitted and duplicates output sequence with it and utilize the wide transmission factor of a kind of sign indicating number to receive, and this factor is the length (number of binary cell or bit) of list entries and the ratio of the length of the source-representation of list entries (same definition).

The formation of generator classification algorithm

A kind of algorithm is the standard of a series of arithmetic operator process finite aggregates, and algorithm produces some numerical value are distributed to each result (output) by the respectively number of value of distributing to computing (input) is carried out arithmetic operator.A kind of algorithm is described by some algebraic equations.For the purpose of describing, in the standard of this algorithm, the composition of an algorithm is by some The Representation Equation independently.Algorithm has some the still intermediate operations in some intermediate object programs and these intermediate object program, and this algorithm is called and will be recurrence.This conception of species still is the decomposition and the juxtaposed basis of algorithm.Last these attributes of algorithm are strict auxiliary for the discussion here.

Because all members of a generator classification algorithm must produce binary sequence, so this algorithm will always only be considered the binary arithmetic algorithm.In other words, this algorithm only considers that its computing and result are binary variables and by fully comprising binary arithmetic operation, and no matter be algorithm by the formulate of direct or recurrence.This algorithm must be to be about all linear functions (disjunctive normal form) (disjunctive normal form) initial and intermediate operations.Thereby:

Any generator type algorithm is a group of all binary arithmetic algorithm types.

Each limited binary sequence (in any realization, only when limited sequence, source-representation data are by operating) has at least one (ordinary) generator algorithm (trival) generator algorithm.(for binary sequence { a _n: 1≤n≤N }, this ordinary generator algorithm is { X _O=1; X _n=a _nX _O; 1≤n≤N }.) therefore for any finite sequence all exist a generator classification algorithm and hereto sequence have a generator algorithm at least, this algorithm has the minimum number of a clear relation (equation) in its standard.This is a minmal sequence generator.The example of the ordinary generator that has just provided shows that the minimum generator of any length n sequence has individual independently (nonredundant) equation of n+1 at least in its standard.

Therefore, there is a generator classification algorithm in the collection for any finite sequence.If each member of such sequence sets has the expression that counts and make up of the member of a kind of fixed set (collection of basic sequence) according to some sequences, the minimum generator of these basic sequences has formed one group of base algorithm and can have been represented by algebraic combination (basic algorithm synthesis) by the member of one group of base algorithm for all members of the generator classification algorithm of original sequence sets so.These algebraic combination must be to realize according to the formula of being made up of the corresponding standard that counts combination of basic algorithm output (aggregative formula).The composition of these basic algorithm synthesis and other will clearly depend on the form that the basic sequence of original sequence sets is represented.

For the sequence sets that provides, one group of base algorithm is the Algebraic Structure that is common to all generator algorithms.This organizes basic algorithm and can be reduced by abandoning those basic algorithms that utilize other basic algorithm synthesis to come out, and the basic algorithm synthesis of each member of generator classification algorithm then can restart according to one group of base algorithm of this minimum.The base algorithm synthesis can also reduce complexity by the redundancy of going to disappear.The application of these steps will produce a basic algorithm synthesis of minimum for each generator algorithm, but can not guarantee that generally this minimum comprehensively will be unique, for a given generator algorithm, may exist several effective minimums comprehensive.Fact although it is so, the standard of the basic algorithm synthesis of any minimum of a specific generator algorithm is produced unique identification by that generator binary sequence.If the purpose of identification only is restricted to concentrate from original series and selects, the detailed description of the member standard of base generator classification algorithm must relate to this identification, all these sequences have identical base generator group with different only at its basic algorithm synthesis.

Above discussion the accurate reconstruction of the base that is used to discern and each original series is provided:

For any binary sequence collection with for the basic sequence collection of these collection,

Each sequence of these collection has fully been stipulated its expression according to basis set;

A corresponding aggregative formula has fully been stipulated in the expression of each this base;

Each this aggregative formula has fully been stipulated the generator algorithm for original series.

Original series is resumed by carrying out the generator algorithm.

In sum, the basis representation of any original series has stipulated that an aggregative formula and any this aggregative formula realized the recovery of original series.

General basic sequence collection

The mathematical theory of based structures is developed extensively and can be thought that the professional for any art of mathematics is known.The arithmetic computation process that comprises binary number realizes that by utilizing integer valued function (ival) this integer valued function is transformed to binary number their the real integer of using symbology mutually:

Ival(x)=the 1(real number) when the X=1(binary number);

=0(real number) when the X=0(binary number).

The inverse function of ival is expressed as ival ^-1Definition in a usual manner.

Theoretical foundation for the source-representation data communication is according to naming the value of counting for aval() (arithmetic value) and bval(binary value) second pair of conversion of (binary value) develop.

Aval(X)=+ the 1(real number) when the X=0(binary number);

=-1(real number) when the X=1(binary number).

This function is to binary number and get real number value+1 and-1 definition.The bval function is the anti-phase of aval and by following contextual definition:

Bval(X)=the 1(binary number) when the X=-1(real number);

=0(binary number) when the X=+1(real number).

This function is to real number+1 and-1 and get the binary number value defined.

Function ival and aval connect by two equatioies:

aval（X）＝1-2·ival（X）;

ival（X）＝ (1-aval（X）)/2 。

The correlation of two binary sequences is real numbers, and this real number is being measured the similarity degree of these two sequences.Sequence with big absolute correlation to the sequence that is highly similar and has a little absolute correlation to being highly dissimilar.Correlation function value for any two binary sequences is that consistent item by item number deducts item by item inconsistent number.Sequence with zero correlation is accurately to have how many different items how many identical items are just arranged.Fully aware of, its length is that the sequence of odd number can not have zero correlation, and it is+1 or-1 the minimum degree of correlation that can there be correlation in this sequence.Sequence with correlation N is item by item identical and sequence with correlation-N is item by item complementary.

Suppose two binary sequence { X _nAnd { Y _nCorrelation be a kind of very simple form according to the aval function:

Symbol

The binary add that expression is conventional.

In other words, binary sequence relevant is that to provide the conventional numerical value of the evaluation sequence that counts by the aval function relevant.If all binary numbers are represented by its aval evaluation in the source-representation data communication is calculated, correlation can be realized by the relevance formula of routine so.Therefore:

The source-representation Data Communication Realization is to use the aval function by the equivalence value that counts of list entries to replace the processing that all list entries carry out initialization and all source-representation data communication and utilize conventional arithmetic operator to carry out.

In other words, binary data is expressed as+1 and-1 two unique sequence of counting.This expression is understood by the residual term of this expression, and promptly aval annotates and to show deleted, binary data sequence will as have+1 and-1 sequence of real numbers handles.The correlation function simplified formula is like this:

ρ ({X_{n}}, {y_{n}}) = Σ_{R = 1}^{N} X_{n} \cdot Y_{n} \cdot

From standard theoretically, basic correlation is normalized (quadrature and normal), promptly, if two basic sequences are differentiated, then the correlation of two basic sequences has 0 value (quadrature), if two basic sequences are identical, then correlation has 1 value (normally).Normally is to realize easily for any sequence by the multiplication estimation, if but selected correlation has not had this specific character for quadrature, then require the trial of some calculating.The Gram-schmidt method will be transformed to regular correlation to the correlation of any candidate basic element, if but this correlation has general correlation above, and this method is reduced to a kind of simple algorithm estimation.Because general Gram-Schmidt method will be carried out N to the orthogonalization of the N sequence of length N ²Inferior calculating, this will be reduced to the source-representation data communication particularly and start the correlation with sequence of some little general correlation.

Maximal sequence and its cycle shift item by item say it is a very strong candidate for this selection, and the suitable phase cross-correlation of this sequence sets all has identical minimum value (0 ,+1, or-1).The shortcoming of this selection is that maximal sequence has been complicated from structure, unless the length of sequence is the Mersenne prime number, in other words, length N is prime number and has 2 ^P-1 form (P that will be prime number for such N also must be a prime number).The Mersenne prime number has found such numerical value, promptly requires to think that the sequence of Mersenne prime number length depends on that for any a kind of like this source-representation data communication of hypothesis is not the restriction of a reality.

Peaked formulate is as follows:

If if only,

Σ_{n = 0}^{N - 1}

M(n mod N) mod N m((n+k))=N works as K=0

=-1 when K ≠ 0

Then

M(n)=± 1:n=0,1,2 ..., N-1 } and be a maximal sequence.

Then easily obtain:

Σ_{n = 0}^{N - 1} m (n) = &PlusMinus; 1;

Avoided complementary ordinary duplicating by selecting Negative Selection equably.

Is a normalized basis (with respect to the tolerance that is produced by related operation) by the linearity conversion that counts with the set transform of the cycle shift of maximal sequence { m }, if { m } is any unit of this set, then the unit of its corresponding normalized basis is:

M is that each unit is shifted each unit that produces the m after being shifted.

This here processing procedure is called the conversion that counts.It is anti-phase to be:

The n sequence {

Be normalized, because of crossed by its correlation test of direct calculating and therefore any sequence of N real number can be expressed as to uniqueness

The displacement linear combination; For the selection of any Mersenne prime number and any maximal sequence { m(n) }, this peaked displacement has produced the general base for whole number sequences row of expression length N; The computational process of this expression is that the principle of having used the conversion of just having described that oppositely counts that conventional orthogonal basis is calculated realizes.In a word:

When N was any fixing Mersenne prime number, for the sequence of all N real numbers, a single maximal sequence of length N and its N-1 cycle shift had been formed a universal sequence.

The maximal sequence of a given length has structural characteristic, and this characteristic is divided into normal and ordinary variety classes with maximal sequence; When N was a Mersenne prime number, the suitable variety classes of maximal sequence was (N-1)/p.Difference between " normally " on the mathematical meaning and " ordinary " is to understand easily, and does not therefore here go through; Suitably variety classes relates to a kind of like this method from structure, promptly is converted to the form of Abbe group (Abelian group) from a kind of such kind, i.e. these conversion are a kind of iteration of single conversion entirely.

Each Mersenne prime number length 2 ^PThe binary form of these kinds of-1 maximal sequence is to be produced by the binary shift register of length p, and the form of counting of this radix-2 algorithm is:

Wherein number { a(k) } has+1 or-1 value and be the parameter of fully determining the maximal sequence kind.The various combination equivalence values of this expression are useful to calculating, and this being illustrated in realizes that source-representation data communication computational algorithm is used as statement.For the source-representation data communication, such just as seen, the evident characteristic of all these formula is, the first, and accurately utilize a maximal sequence to produce a complete general basis representation and the second, any this maximal sequence requires only by P～log ₂The N binary number a(k): 1≤k≤p } stipulate.Because any attainable finite sequence (a kind of scheme that can imagine is by certain device recording) can embed in the sequence of Mersenne prime number length, and is as described below:

For an attainable sequence sets of length N, can constitute a general base, this general base requires to be not more than P≤(1+log ₂M ^[N]The binary number of)+1 fully stipulates it, wherein M ^[N]It is the minimum Mersenne prime number that is not less than N.

Utilize the segmentation of the sequence sets of general basis representation generation

As can be seen, the cycle shift of each maximal sequence also is a maximal sequence, with each subclass that therefore is segmented into discontinuous maximal sequence be equivalent in this sense, still each other displacement of maximal sequence and they is not only in any two unit of promptly such subclass.The maximal sequence that belongs to different subclass is different substantially, does not have the cycle shift operation can make them identical in this situation.These subclass are referred to as the kind of maximal sequence, generally are that very complicated and quantitative description are difficult with kinds different on structure.Yet, when N is a Mersenne prime number, the kind of the maximal sequence of some length N from structure by a binary value function X(S) limit, the territory of this function is mod(N-1)/nonnegative integer of p.This function is called the characteristics or the feature of kind and has stipulated the symbol distribution of the radix-2 algorithm form of any maximal sequence.

This feature is limited by following relation:

Not all possible this binary function is a feature, but any cycle shift of a feature also is a feature.Each feature is carried out cycle shift operation, form Abbe group (Abelian group), because all cycle shift all are by advance each the iteration of cycle shift of an accurate position.At least always have a feature, this feature is a maximal sequence with respect to this Abelian group, that is, for this feature, only working as υ is the multiple of (N-1)/p, for all S, and χ (S+ υ mod N) ≡ χ (S).This feature and its (N-1)/p-1 cycle shift will be called the true feature of each maximal sequence of length N and the true kind that corresponding kind is called those maximal sequences.Provide the specification of any one feature of true kind of the maximal sequence of a fixing length N Mersenne prime number, the true kind of all of those maximal sequences just can all be identified by the specification for displacement index υ only, and the territory of index υ is the integer that (N-1)/p asks mould simply.

Said structure is a kind of specific realization of ordinary circumstance, when standard basis representation method is applied to any collection of different but equivalence, just produces this structure.Which base that general parameters υ only discerns collection is under consideration; Its structure implication is extremely useful under maximum kind situation, but the basic state of maximum has been kept privilege.Then, generally make B represent the limited different set of bases (finite collection of distinct bases) of length N binary sequence.Make L=|B| be illustrated among the B number of base and make each unit of B given by any definite sequence.Each unit by following order label B:

B＝｛b（ν）：ν＝0，1，2，…，L-1｝.

Here b(υ) be the set of base and the N sequence of forming N real number:

b（ν）＝｛｛b（n，k;ν）：n＝0，1，2，…，N-1｝：k-0，1，2，…，N-1｝

In the realization of above-mentioned maximal sequence, b(n, k; Be that the n coefficient and the L of K item cycle shift of the maximal sequence of kind υ equals (N-1)/p υ).Any binary sequence a() (in fact, any sequence) has a unique basis representation according to these bases each:

Now, if a given sequence a() with respect to a specific base, such as b(0), have basis representation coefficient { α (k): 0≤K≤N-1 }, then any relevant sequence

Can only produce in the regulation by a base index υ.Fairly obvious, all these sequences have a public basis representation d() differently with it only be to select to be used for this basic b(υ that they are rebuild from these public basis representation).Therefore, the sum total represented of possible N item produces the basis representation segmentation of collection of all sequences of length N and the serial B that this structure of cutting apart depends on the collection of each base fully.The basis representation segmentation comprises discontinuous each sequence sets and only wants the regulation of any sequence of given given segmentation of the regulation of its unique base index υ.Comprehensive summary is as follows:

Any limited orderly serial B of normalization base is segmented into the discontinuous sequence sets with public basis representation with the summation of the sequence of regular length; Any sequence is identified at segmentation fully, and this segmentation only belongs to the regulation of the preface index of its relevant base.

The Operations Analyst process of source-representation data communication

Comprehensive and developed various detailed process discussion in the description of the generality analysis of source-representation data communications method, specific as follows:

Stipulate the orderly serial B of a N item normalization base.

An incoming bit stream is segmented into some sequences of common length N.

Select a specific base and calculate the basis representation of first sequence of these output sequences according to selected base.

Each list entries of back is compared with the full fragment sequence that this basis representation produces, seeks first list entries, if with find item by item consistently, then corresponding basic index is recorded the expression number as this particular sequence.When not finding item by item unanimity, then count new basis representation and repeat top process.

The general realization of finite state source representative data communication means

When above-mentioned public base system row method was used to realize the source-representation data communication, the sequence segmentation of the length N that is produced by basis representation was (structure and the preface) that is independent of the base system array structure; They only depend on list entries.These segmentations will be regarded as unique by the N item sequence label with the former selection of basis representation and the 0th base.Therefore, basis representation is independent of each state variable that the particular source that is realizing is represented the history of data communication and constituted system like this.In this case, because two processes of analysis and synthesis are linear (normalization basis representation and reconstruction equation are linear), this source-representation Data Communication Realization is actual to be a linear finite state system, with because list entries only is input to this system (control input), this system is a kind of linear finite state automatic system.Such system had been carried out extensive studies by people and existed very considerable Fundamentals of Mathematics (comprising the software realization) in the document of publishing.

Yet a kind of specific source-representation Data Communication Realization can be formed under the situation of this complexity not having.Initial step comprise select sufficient length Mersenne prime number (for example, preferably, at least 2 ¹³-1=8191 has 630 maximal sequence kinds, or 2 ³¹-1=2147843647 has 69273666 maximal sequence kinds) be the maximal sequence calculated characteristics of this length, the maximal sequence that constitutes each kind is represented.Then the base system row comprise that all displacements by these maximal sequences produce the normalization base.Remaining step is summed up as follows:

Fixed series generator type algorithm for a basic B of normalization comprises all linear combinations from each base element of single basic sequence collection B.Aggregative formula is each unit specification of base system row in this case.If this sequence length is a Mersenne prime number, then aggregative formula can simplify for the maximal sequence of length N and be the generator algorithm.The source-representation of any list entries comprises the data block of continuous binary data, and these data blocks comprise, the initialization basis representation, and the back is the index of basic sequence.

The estimation of the information bit that general source-representation data communication requires

For a length N data block sequence, the source-representation sign indicating number comprises N string N binary number (binary form of the first data block basis representation) to the R data block of input data as what describe in the 7th part, the back and then is the index of R-1 radix, this index is the number from 0 to L-1, and L is the number of different bases in serial B.The bit number that requires for the radix representation of stipulating is N ²And the bit number that the expression base index requires is log ₂L; The bit number that the R data block of the data of expression input then requires is:

N ²+（R-1）log ₂L.

The bit number that the R data block of expression N binary number requires is RN; The ratio of these numbers is:

(N ²＋(R-1)log ₂L)/(RN) = (N)/(R) +(1- 1/(R) ) (log ₂L)/(N)

If requirement, when R=1, this ratio is reduced to N, but when R＞＞during N, this ratio is asymptotic to be arrived

(log ₂L)/(N)

When N is a Mersenne prime number, N～2 ^PAnd L～2 ^P/ P, this ratio becomes like this

(p-log ₂p)/(2 ^P)

With fully aware of, this effective code bandwidth factor (inverse of this ratio) when list entries when single segmentation is the data block of length N remainder, be unbounded.

The organization definition of the kind of maximal sequence

Ask the residual term classification of mould for Mersenne prime number N

The Mersenne prime number is 2 ^PA prime number N of-1 form.Integer P is preface and the integer L=(N-1 that N asks the subgroup of mould residual term multiplication) if/p(N is that a Mersenne prime number N-1 is divisible by P) be the preface that N asks mould residual term multiplication subgroup; This second subgroup is with respect to first subgroup, and asking the quotient group of mould residual term classification with N is homomorphism.Integer 2 be preface be p the subgroup generator and can find an integer g, this integer produces quotient group, promptly as g ^λ=1 mod N ()/() ()/() λ=L or 0.Then, each N asks the unit or 0 or obey 0≤λ≤L-1 for some of mould residual term classification, and the λ of 0≤ρ≤P-1, ρ are g ^λ2 ^pUnique form.λ is that to ask unit of mould L residual term classification and ρ be the unit that P asks mould residual term classification.

Symbol invariant subgroup and feature with respect to the maximal sequence of generator 2

By using known method, can set up any maximal sequence { m(n): 0≤n≤n-1 } of the Mersenne prime number of length N, exist N to ask the fixed number K of mould residual term classification, promptly obtain:

m（（n+k）mod N）＝m（（2n+k）mod N），

n＝0，1，…，N-1.

This result is called the sampling characteristic of maximal sequence " 2 ".When K=0, maximum m is called principal piece (principal phase).Because above-mentioned relation also will be satisfied by the complement code of any sequence that satisfies, pass through m(0 traditionally for the principal piece sequence)=-1 additional requirement realizes monambiguity.

Represent according to residual term classification described above, the sampling characteristic description of " 2 " in the subgroup scope of generator 2 its set of indexes the symbol invariant of each unit of principal piece maximal sequence.The function of this λ is sequence { m } feature;

m（（2 ^ρg ^λ）mod N）＝χ（λ），λ＝0，1，…，L-1.

Because the surplus product representation of index n is only got rid of n=0, this is expressed as follows:

Σ_{n = 1}^{N - 1} m (n) = Σ_{n = 0}^{N - 1} m (n) - m (o) = (- 1) - (- 1) = 0

= Σ_{P = 0}^{P - 1} Σ_{λ = 0}^{L - 1} (m (2^{ρ} g^{λ}) \mod N) = Σ_{p = 0}^{p - 1} Σ_{λ = 0}^{L - 1} x (λ)

= p \cdot Σ_{λ = 0}^{L - 1} x (λ) .

The result is:

Σ_{λ = 0}^{L - 1} x (λ) = 0

The structure of maximal sequence in principal piece

Above stipulated its feature according to the number of maximal sequence, this regulation can be described, so that present some coefficients according to this feature; Used method is his function (δ ()) of Kronecker Dare, only when its independent variable is 0, be defined as have 1 value and otherwise have 0 value;

m (n) = - δ (n) + Σ_{λ = 0}^{L - 1} Σ_{p = 0}^{p - 1} δ ((n - 2^{ρ} g^{λ}) \mod N) x (λ)

Equation for this characterizing definition is to derive from the correlated condition of maximal sequence; When the correlation equation of top formula being brought into maximal sequence, the result is:

r (k) = Σ_{n = 0}^{N - 1} m (n) m ((n + k) \mod N) = (N + 1) δ (k) - 1

(condition of definition maximal sequence)

At K is not 0 o'clock, and it has the feature of asking each unit of multiplication group on each non-zero unit of mould residual term type at N; Therefore, K=g ^λ2 ^ρThe λ value of mod N and last expression is represented to set up at this K.Attention characteristics X(λ) only have value+1 and-1, comprising δ's (K) and having the L value and work as K=0, if requirement, so that part of N that is reduced to of formula.

Expression in the square brackets only depends on that N asks the structure of mould residual term and it calculates the basic step in this formula like this: order:

X (λ_{1}, λ_{2}) = Σ_{p_{1} p_{1} - 0}^{p - 1} δ ((1 - g^{λ_{1}} 2^{ρ_{1}} - g^{λ_{1}} 2^{ρ_{1}}) \mod N)

This equation then:

-x((λ+ \frac{L}{2})mod L)-x(λ)+

+ Σ_{λ_{1,} λ_{= = 0}}^{L - 1} x ((λ_{1}, λ_{2}) x (λ_{1} + λ + \frac{L}{2}) \mod L) x ((λ_{2} + λ) \mod N) = - 1

It is the definite equation (being also referred to as characteristic equation) of feature.After suitable returning formatted, these were separated and are+1 ,-1 sequence and only to repeat they self these sequences under L step period displacement are features of the true kind of each maximal sequence.General only have L this equation separate and they all are displacements mutually.Each is such separates the kind that produces a maximal sequence from the relation of the definition of this section description in principal piece.

The calculating of quotient group generator g

Number N asks mould n ^pScope, n is the unit that N asks the non-zero of mould residual term type, comprises every power of quotient group generator g.This quotient group be the L preface and like this these numbers to ask each power of mould from N of 0 to L-1 all be each time power of quotient group generator.From this manifold, can directly select each initial data g ^L=1.(these are number g, only when λ=L, make g ^λ=1).For carrying out above-mentioned these numbers of process will be enough, and generally the simplest is the most easy-to-use.

Following relation of plane provides the number theory basis for these descriptions:

2 ^p＝1mod N;

n＝g ^λ2 ^ρmod N

()/()

n ^p＝（g ^λ2 ^ρ） ^p＝g ^pλ2 ^pρ＝（g ^p） ^λmod N;

G is g ^LOriginal of of=1mod N

()/() ()/()

g ^pBe g ^L=1mod N(P, L are relative prime numbers) one original;

G is g ^LOriginal of of=1mod N

{ g ^λMod N:0≤λ≤L-1 } be the true Abbe group (Abelian gruop) of L preface.

The general flow figure of source-representation data communication

The source-representation data communications method can comprise two independently systems approaches basically; A method operation list entries is operated this expression and is rebuild original list entries with its expression of generation and another method.These two systems are called analytical system (coding) and integrated system (decoding).Demonstration is found out, analyzes and sends relevant and comprehensive and receive relevant.The general flow figure of these methods as shown in Figure 3, above the various parts that need of system design are also included within wherein.

Claims

1, a kind of data compression system, this system comprises:

Be used for receiving the device of the serial data of representing the information that will transmit at an input;

Be used to select to represent the device of one or more maximal sequences of at least a portion of serial data; With

Be used for to send to the device of an output corresponding to one or more labels of selected each maximal sequence.

2, the data compression system of claim 1, wherein receiving system comprises an encode processor, is used for serial data is divided into the data block of some binary bits and also comprises the device that is used for the binary bits data block is transformed to the data block of numerical value.

3, the data compression system of claim 2, wherein be used to select the device of one or more maximal sequences to comprise the device that is used to first data block to determine a maximal sequence of length N, wherein first data block of binary bits comprises the data block of a N binary bits.

4, the data compression system of claim 3 also comprises being used for and a remote processor communicating devices.

5, the data compression system of claim 4, wherein remote processor comprises the device corresponding to a label of a maximal sequence that is used to receive from dispensing device output;

Be used for comprehensively device corresponding to the maximal sequence serial data of this label from dispensing device.

6, the data compression system of claim 3, wherein N is a Mersenne prime number.

7, the data compression system of claim 6 also comprises being used to select the maximal sequence of at least one length N and the device of number " i ", this number " i " represent at least one maximal sequence will by displacement what.

8, a kind of method of data compression, this method may further comprise the steps:

At the input of a processor, a serial data of the information that the reception expression will transmit;

Select one or more maximal sequences, this sequence has unique label of at least a portion of this serial data of expression; With

Send to the one or more labels of output corresponding to the maximal sequence of representing this serial data part.

9, the data compression method of claim 8, wherein receiving step comprises this serial data is divided into the step that contains the data block of N character in each data block.

10, the data compression method of claim 9, partiting step wherein comprise this serial data step that to be divided into binary bits data block and this binary bits data block of conversion be the numeric data piece.

11, a kind of data compression system, this system comprises:

Be used in the device that input receives the serial data of representing the information that will transmit;

Be used to select the device of one or more equations, the one or more orthogonal basis series of these The Representation Equation, and these orthogonal basis series are represented at least a portion of this serial data, wherein each equation has a corresponding unique label; With

Be used to send to the device of the one or more labels corresponding to selected equation of output.

12, the system of claim 11, wherein choice device comprises the device that is used to select one or more maximal sequences.

13, a kind of form with a plurality of finite state machines constitutes one group of method that is used for the algorithm of data compression system, and this method may further comprise the steps:

Determine a set of equations, the separate equation in this set of equations is quadrature each other, with the orthogonal basis of at least a portion of being formed for representing a data sequence;

Keep one group of orthogonal basis; With

Distribute a unique label for each base, wherein each label list is shown in the corresponding base that forms in definite this set of equations step.

14, the method for claim 13 determines that wherein the step of a set of equations comprises the step of determining at least one maximal sequence.

15, the method for claim 14 determines that wherein at least one maximal sequence comprises the step of the N-1 cycle shift of the maximal sequence of determining at least one length N and this length N maximal sequence.

16, a kind of data transmission method that comprises following each step:

Be one group of operating parameter of at least one finite state machine precomputation;

A data flow is divided into the sequence of each data block;

For each data block, send a deck label, thus some groups that each label is discerned some state machines uniquely a group in operating parameters; With

In receiver, rebuild each data block in response to each label that is sent, as algebraic combination from an output sequence of this finite state machine.

17, a kind of data compression system, this system comprises:

Be used for receiving a serial digital data representing the information that will send at an input;

The device that comprises each system of each equation with corresponding label, thereby and this device be used for selecting one or more representatives at least the label of the part of this serial data produce digit data sequence; With

Be used to send to the device of one or more some labels corresponding to a serial data part of output.