CN104035838B - Hardware-supported memory logging - Google Patents

Info

Publication number
CN104035838B
Authority
CN
China
Prior art keywords
plid
data
daily record
row
log
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201410077960.8A
Other languages
Chinese (zh)
Other versions
CN104035838A (en
Inventor
D. R. Cheriton
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Intel Corp
Original Assignee
Intel Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US14/178,130 (US9477558B2)
Application filed by Intel Corp
Publication of CN104035838A
Application granted
Publication of CN104035838B
Expired - Fee Related
Anticipated expiration

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention relates to hardware-supported memory logging. Logging changes to a physical memory region during a logging time interval includes: detecting a write operation to the physical memory region, wherein the write operation modifies an indirect representation that corresponds to a physical data line in the physical memory region; and recording log information associated with the write operation.

Description

Hardware-supported memory logging
Cross-reference to other applications
This application claims priority to U.S. Provisional Patent Application No. 61/775,041, entitled HARDWARE-SUPPORTED MEMORY TEMPORAL COPY AND LOGGING, filed March 8, 2013, which is incorporated herein by reference for all purposes.
Technical field
The present invention relates to hardware-supported memory logging.
Background
A common requirement of database systems is the ability to provide a snapshot (that is, a copy) of the database as of a specified point in time. In particular, the "consistent read" capability of many databases requires the ability to run queries against the committed state of specific data (for example, a database) as of a specified point in time. The common case is when that time corresponds to the start of query processing. Other times are possible and often need to be supported. For example, a query can select the gold customers who had orders of more than 1,000,000 dollars as of the close of business yesterday. Additional requirements include the ability to restore the committed state of the database after a failure and to provide time-series data for a data set (that is, how its values change over time).
Typically, consistent read and recovery functions are implemented in software. Existing implementations typically result in memory-intensive operations that have a negative effect on processor cache performance, because these operations bring extra data (for example, log data and/or metadata) into the processor cache. In particular, the processor often stalls while waiting for data from main memory, and may evict other data relevant to ongoing processing from the processor cache in order to make room for the additional data.
In addition, under increased load, transactions often require consistent reads of data blocks that have changed since the query started, which incurs the cost of rolling the current state back to the time at which the query started. These costs tend to grow as the load on the system increases, resulting in poor degradation under load.
Typical software implementations of consistent read and recovery functions further suffer from the overhead of synchronizing with other processor cores running in the same system, because the other processor cores in the system must concurrently access the data structures of the log and the buffer pool. This synchronization is in effect additional inter-core cache traffic, which further degrades per-core performance and overall system performance.
Typical software implementations of consistent read and recovery functions rely on undo and redo logs. The same problems arise in software implementations that write the undo and redo logs on database updates, because the processor cores need to access metadata and data frequently and to synchronize. In particular, to add an undo record to the undo log and a redo record to the redo log as part of recording an update in a transaction, the processor needs to access the data corresponding to the end of the undo log and the data corresponding to the end of the redo log, and then perform writes to both. The update process must also access any auxiliary or management data structures associated with these logs, as well as the code segments that store the instructions for performing these actions. It also needs to synchronize with other processor cores in order to update the logs. Performance is therefore negatively affected.
Brief description of the drawings
Various embodiments of the invention are disclosed in the following detailed description and the accompanying drawings.
FIG. 1A is a block diagram illustrating an embodiment of a system configured to provide hardware-supported temporal copy of memory.
FIG. 1B is a diagram illustrating an example of an indirect memory representation.
FIG. 1C is a diagram illustrating another example of an indirect memory representation.
FIG. 2 is a flowchart illustrating an embodiment of a consistent read process implemented in a system such as system 100 of FIG. 1A.
FIG. 3 is a flowchart illustrating an embodiment of a temporal copy process.
FIGS. 4A-4C are data diagrams illustrating examples of the data and logs used in an example consistent read process.
FIG. 5 is a data diagram illustrating an example of an embodiment of a merge-update copy process.
FIG. 6A is a diagram illustrating an embodiment of physical data lines in memory.
FIG. 6B is a diagram illustrating an embodiment of a log representation based on the data line representation of FIG. 6A.
FIG. 7 is a flowchart illustrating an embodiment of a process for generating log information.
Detailed description
The invention can be implemented in numerous ways, including as a process; an apparatus; a system; a composition of matter; a computer program product embodied on a computer readable storage medium; and/or a processor, such as a processor configured to execute instructions stored on and/or provided by a memory coupled to the processor. In this specification, these implementations, or any other form that the invention may take, may be referred to as techniques. In general, the order of the steps of disclosed processes may be altered within the scope of the invention. Unless stated otherwise, a component such as a processor or a memory described as being configured to perform a task may be implemented as a general component that is temporarily configured to perform the task at a given time or as a specific component that is manufactured to perform the task. As used herein, the term "processor" refers to one or more devices, circuits, and/or processing cores configured to process data, such as computer program instructions.
A detailed description of one or more embodiments of the invention is provided below along with accompanying figures that illustrate the principles of the invention. The invention is described in connection with such embodiments, but the invention is not limited to any embodiment. The scope of the invention is limited only by the claims, and the invention encompasses numerous alternatives, modifications, and equivalents. Numerous specific details are set forth in the following description in order to provide a thorough understanding of the invention. These details are provided for the purpose of example, and the invention may be practiced according to the claims without some or all of these specific details. For the purpose of clarity, technical material that is known in the technical fields related to the invention has not been described in detail, so that the invention is not unnecessarily obscured.
Hardware-supported temporal copy and logging of memory are described. In some embodiments, the hardware support is provided by a hardware component that is separate from the central processing unit. In various embodiments, to support temporal copy, a snapshot is generated based on a known memory state and log information. In various embodiments, the log information is determined based at least in part on an indirect memory representation.
FIG. 1A is a block diagram illustrating an embodiment of a system configured to provide hardware-supported temporal copy of memory.
System 100 includes one or more central processing units (CPUs, also referred to as application processors or processors) 102 configured to execute program instructions, one or more caches 104 configured to provide temporary low-latency storage to CPU 102, and a main memory 108 configured to provide instructions and data to CPU 102. Main memory 108 typically has a larger capacity and a higher latency than cache 104. In some embodiments, the cache is implemented using static random access memory (SRAM) and the main memory is implemented using dynamic random access memory (DRAM). Other implementations are possible. In addition, the system can have secondary storage, such as a disk.
Copies of frequently used data are stored in cache 104. When CPU 102 needs data (for example, when an application requests a particular piece of data from a database), cache 104 is checked first. If the data is not found in cache 104, a cache miss occurs, and main memory 108 is checked to locate the data.
In this example, memory controller 106 is configured to manage the flow of data (including instructions) to and from main memory 108, thereby facilitating access to main memory 108 by CPU 102. Memory controller 106 is implemented as a module separate from CPU 102, and the two components need not communicate with each other directly (in other words, they need not have a direct interface or connection). Memory controller 106 and CPU 102 can exchange data via cache 104.
A copy coprocessor (CCP) 110 is configured to cooperate with the CPU to support consistent read and logging functions. As described in greater detail herein, CCP 110 is configured to perform actions such as copying data and providing snapshots. CCP 110 is a hardware component separate from CPU 102. The CCP need not have a direct connection (for example, an interface or a bus) to the CPU. In some embodiments, the CCP and the CPU are implemented on separate chips or circuits. In various embodiments, the CCP interfaces with CPU 102 by transferring data to and from memory controller 106 and/or cache 104. In some embodiments, CCP 110 is implemented as a component separate from the memory controller, and the two components communicate with each other via a communication interface. In some embodiments, CCP 110 is integrated with memory controller 106 as part of the memory controller's circuitry.
Data (such as a database or another collection of data) is stored in main memory 108. In some embodiments, specific memory regions are designated to be logged. For example, the operating system can set one or more configuration registers to specify the address and size of a memory region to be logged. Writes to the designated memory region are logged. In this example, an undo log 112 and a redo log 114 are maintained by CCP 110 in main memory 108. For a specific memory region (for example, a memory page at a particular address), the redo log includes the updates that have been performed, that is, the new values written since the last checkpoint. The undo log includes the values that were overwritten by these updates since the last checkpoint (that is, the old values).
In some systems, data is committed frequently but is written to backing store (for example, a persistent data store such as a disk) less frequently, at particular checkpoints. The redo log allows the committed state to be restored after a failure by reading from backing store a snapshot of the data state at a checkpoint corresponding to an earlier time, and then applying the committed state in the redo log to the checkpointed state, which brings the data state forward in time to the last logged committed state. The redo log thus allows the system to avoid the cost of writing in-place updates out to persistent storage on every commit, while still allowing recovery from a loss of memory state.
The undo log is used to provide the data state at an earlier time by applying the entries of the undo log in reverse order to a copy of the state at a later time, until the state has been "undone" back to its state at the specified time. The common case for this "later time" is the current time, in which case the known state corresponds to the current state of the database. The undo log facilitates the implementation of atomic transactions (an atomic transaction comprises a set of write operations that must either all be committed together or not be committed at all), because conflicts caused by different transactions writing to the same data can be undone.
For example, if a memory region initially stores the value "1" and is subsequently modified to store the value "2", then "1" is stored in the undo log and "2" is stored in the redo log. Given the initial state "1" and the redo log, the later committed state can be determined to be "2". Given the later state "2" and the undo log, the earlier committed state can be determined to be "1".
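The "1" to "2" example can be sketched in software as follows. This is only an illustration of the bookkeeping, not the hardware mechanism; the names `LoggedCell`, `roll_forward`, and `roll_back` are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass, field


@dataclass
class LoggedCell:
    """One logged memory location (hypothetical software model)."""
    value: int
    undo_log: list = field(default_factory=list)  # overwritten (old) values
    redo_log: list = field(default_factory=list)  # written (new) values

    def write(self, new_value):
        self.undo_log.append(self.value)  # remember what is being overwritten
        self.redo_log.append(new_value)   # remember what is being written
        self.value = new_value


def roll_forward(initial, redo_log):
    """Re-apply logged writes, oldest first, to reach the later state."""
    state = initial
    for new_value in redo_log:
        state = new_value
    return state


def roll_back(current, undo_log):
    """Undo logged writes, newest first, to reach the earlier state."""
    state = current
    for old_value in reversed(undo_log):
        state = old_value
    return state


cell = LoggedCell(value=1)
cell.write(2)
later = roll_forward(1, cell.redo_log)   # later committed state
earlier = roll_back(2, cell.undo_log)    # earlier committed state
```

After the single write, the undo log holds the old value and the redo log holds the new value, so either committed state can be derived from the other.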
In some embodiments, physical memory such as main memory 108 is presented to the processor using an indirect memory representation, in which there is a level of indirection between the physical address issued by the processor and the location of the actual data line (also referred to as a cache line) in physical memory. Detailed examples of such indirect memory representations can be found in U.S. Patent No. 8,407,428, attorney docket HICAP001, which is incorporated herein by reference in its entirety for all purposes, and U.S. Patent No. 7,650,460, attorney docket HICAP003, which is incorporated herein by reference in its entirety for all purposes.
FIG. 1B is a diagram illustrating an example of an indirect memory representation. In this example, a page in main memory is divided into segments, or lines. Some of these lines store actual data content and are referred to as data lines. Some of these lines store physical line identifiers (PLIDs) that reference data lines, and are referred to as translation lines or indirect lines. As shown, data lines 152-156 store the actual data, and PLIDs P1-P4 are used to form the lines of memory that correspond to the proper data. The processor (for example, the CPU) accesses the data lines referenced by the PLIDs indirectly, using a PLID address that is computed from the physical address issued by the processor. For example, the set of PLIDs P1 and P2 (an indirect line) references the set of data lines 152 and 154, which corresponds to the data content "ABCD". Another set of PLIDs, P3 and P4, references the set of data lines 156 and 154, which corresponds to the data content "EFCD". To access the data content "ABCD", the processor accesses the physical address of the indirect line containing PLIDs P1 and P2, and then uses these PLIDs to locate the data lines that contain the data, that is, the data lines corresponding to P1 and P2. In some embodiments, the memory controller facilitates data access by providing the mapping from PLIDs to data lines. A data structure comprising a set of PLIDs (which reference a corresponding set of physical data lines containing the actual data content) is referred to as an indirect line. A write operation amounts to changing the PLID stored at the position in the translation line entry that corresponds to the write address to a different PLID, so that a different data line is referenced.
In some embodiments, the memory used to store data is organized as an array of fixed-size data lines, each addressed by a PLID. Data lines are reference counted and can be shared. In other words, there can be multiple PLIDs that reference a single data line. The size of a data line depends on the implementation and can differ between embodiments. In some embodiments, data lines are deduplicated (in other words, each data line has unique content, and PLIDs that reference the same data content do so by referencing the same data line). For example, the data content "CD" is used by multiple PLIDs but is stored in only a single data line.
In some embodiments, each data line is immutable. In other words, once a data line has been assigned a particular value, it does not change for the duration of the application. If data needs to be written, the indirect line entry that stores a PLID referencing the old data is changed to store a different PLID that references the new data. For example, an indirect line entry initially stores PLID P1, which references data content AB. If the data content needs to be changed to EF, the entry is changed to PLID P3.
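A minimal software sketch of these properties (immutable, deduplicated, reference-counted data lines, with writes expressed as PLID changes) is shown below. The class `IndirectMemory` and its methods are hypothetical names introduced for illustration. As a simplifying assumption, a PLID value here identifies a data line directly, so two entries share a data line when they hold the same PLID value.

```python
class IndirectMemory:
    """Toy model of immutable, deduplicated data lines addressed by PLID."""

    def __init__(self):
        self.lines = {}        # PLID -> data content (never modified)
        self.by_content = {}   # data content -> PLID (deduplication index)
        self.refcount = {}     # PLID -> number of indirect-line entries using it
        self.next_plid = 1

    def plid_for(self, content):
        """Return the PLID of the line holding content, allocating if needed."""
        if content not in self.by_content:
            plid = self.next_plid
            self.next_plid += 1
            self.lines[plid] = content
            self.by_content[content] = plid
            self.refcount[plid] = 0
        return self.by_content[content]

    def new_indirect_line(self, contents):
        """Build an indirect line (a list of PLIDs) for the given contents."""
        plids = [self.plid_for(c) for c in contents]
        for plid in plids:
            self.refcount[plid] += 1
        return plids

    def write(self, indirect_line, offset, content):
        """A write swaps the PLID in one entry; data lines never change."""
        old = indirect_line[offset]
        new = self.plid_for(content)
        self.refcount[old] -= 1
        self.refcount[new] += 1
        indirect_line[offset] = new

    def read(self, indirect_line):
        return "".join(self.lines[plid] for plid in indirect_line)


mem = IndirectMemory()
first = mem.new_indirect_line(["AB", "CD"])    # content "ABCD"
second = mem.new_indirect_line(["EF", "CD"])   # content "EFCD"
shares_cd = first[1] == second[1]              # "CD" is stored only once
before = mem.read(first)
mem.write(first, 0, "EF")                      # change AB to EF by swapping a PLID
after = mem.read(first)
```

Because the write only replaces one entry of the indirect line, the data line holding "AB" is left intact and merely loses a reference.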
The techniques described herein are generally applicable to memory that is represented using an indirect memory representation. Although an indirect memory representation similar to the one shown in FIG. 1B is discussed in more detail below, other indirect memory representations can be used. FIG. 1C is a diagram illustrating another example of an indirect memory representation, in which the PLIDs are organized into a directed acyclic graph (DAG).
Consistent read
FIG. 2 is a flowchart illustrating an embodiment of a consistent read process implemented in a system such as system 100 of FIG. 1A. In this example, process 200 is invoked by the CCP in response to a consistent read request made by the CPU.
At 202, a consistent read request for a snapshot of a memory region as of a specific time is received. The consistent read request includes information about the location of the memory region of interest and the specific point in time for which the snapshot (that is, the copy) of the memory region is requested. In some embodiments, the consistent read request is an instruction sent from the CPU to the CCP via the memory controller.
At 204, a temporal copy operation is performed.
In some embodiments, both the undo log and the redo log are available for use by the temporal copy. In some embodiments, the temporal copy operation includes selecting the undo log or the redo log based on context. In some embodiments, the log is selected before the temporal copy operation is invoked, and the selected log is used by the temporal copy operation. The selection can be made by the CPU, the memory controller, the CCP itself, and so on. As described in greater detail below, the log selection depends on whether the consistent read process is used to perform an undo operation to obtain a snapshot of the data in an earlier committed state, or a redo operation to obtain a snapshot of the data in a later committed state. In some embodiments, the log is selected according to the caller's specification; in some embodiments, the log is selected based on the requested time.
The temporal copy operation includes generating the snapshot based on the selected log, a known state of the memory region (for example, an existing snapshot of the memory region in a committed state), and a timestamp associated with the snapshot. The temporal copy generates a snapshot of the memory region as of the specified time. The generated snapshot of physical memory is provided to the first processor for use by an application executing on the first processor.
FIG. 3 is a flowchart illustrating an embodiment of a temporal copy process. Process 300 can be used to implement 204 of process 200. In this example, the temporal copy operation is specified as having the following function interface:
temporalCopy(src, dest, timestamp);
where src and dest correspond to a source memory location (for example, a source buffer location) and a destination memory location (for example, a destination buffer location), respectively. The function generates a buffer at location dest (for example, physical address 0x1000000) that contains the state of the buffer at location src (for example, 0x10001111) as of the time given by the specified timestamp (for example, 11:00 AM on January 12, 2014; 201401121100; and so on). The memory state of src is known, and the memory state of dest is to be determined. In this function interface, the known state corresponds to the state of src at the current time. In some embodiments, the function interface can provide additional parameters that specify the state of src at a time other than the current time (such as the time at which src was checkpointed and saved to disk). In some embodiments, the temporal copy function is invoked by the CPU to instruct the CCP to perform the temporal copy.
In some embodiments, the temporal copy is performed on a memory region that includes one or more pages. In some embodiments, the memory region is independent of the page structure. For example, a memory region may include multiple indirect lines of an indirect memory structure (for example, arrays of PLIDs). For example, a memory region of 4 kilobytes (the size of a conventional page) can be divided into 64 lines of 64 bytes each. If the size of a PLID is 32 bits, then each page uses 4 translation lines, each storing 16 PLIDs, to reference the data lines in the region. In other embodiments, other memory region, data line, and PLID sizes can be used.
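The sizing in this example can be checked with a few lines of arithmetic. The constant names are illustrative only, and the last figure (the total size of the translation lines) is derived here rather than stated in the text.

```python
REGION_BYTES = 4 * 1024   # 4-kilobyte memory region (conventional page size)
LINE_BYTES = 64           # size of one line
PLID_BITS = 32            # size of one PLID

data_lines = REGION_BYTES // LINE_BYTES              # 64 data lines per region
plids_per_line = LINE_BYTES // (PLID_BITS // 8)      # 16 PLIDs fit in one line
translation_lines = data_lines // plids_per_line     # 4 translation lines
translation_bytes = translation_lines * LINE_BYTES   # 256 bytes of PLIDs
```

Under these sizes, the PLIDs that describe a 4-kilobyte region occupy 256 bytes, which suggests why copying PLIDs instead of data lines can be much cheaper than copying the region's content.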
In some embodiments, the src and dest specifications each refer to a separate data structure that provides additional information about the source and destination memory regions themselves. For example, in some embodiments, the application specifies src as a virtual address rather than a physical address. In such an embodiment, the separate data structure includes the operating system's virtual memory mapping, which can specify as additional information the file associated with the source region, the logs used to undo and redo changes to the region, and other attributes (such as transactional behavior). Dest can be specified similarly. The operating system software translates the virtual address to a physical memory location, ensures that the physical memory location contains the content associated with the logical content, and further determines from the additional information which log is to be used by the temporal copy. In another embodiment, src is specified as a region in a logical data set. That is, it identifies a logical unit of data that may be located at another physical address, or that may not be located at any physical address at the specified time. In this case, the software that implements the data set maintains additional information indicating where copies of the logical data are stored (for example, in which checkpoints and buffers), the logs associated with src, and other configuration parameters that control how the data is instantiated in memory. In some embodiments, the dest parameter is omitted, and the temporal copy returns an indication of the location in which the data resulting from the temporal copy is stored.
In this example, at 302, the data in the source memory location is copied to the destination memory location. In embodiments where memory is represented using an indirect memory representation (such as those shown in FIGS. 1B-1C), the copy operation includes copying the PLIDs in the translation lines. Because the actual data lines referenced by the PLIDs are not copied, the amount of data copied can be considerably smaller than the total data content of the source memory region, which makes the copy operation very efficient.
At 304, a known timestamp associated with the known state of the source memory location (for example, the current time in the case where the known state is the current state) is compared with the specified timestamp associated with the state to be generated. The result of the comparison is used to select the appropriate log. In some embodiments, the known timestamp (or the location of the corresponding entry in the log) is specified to the CCP before the temporal copy operation. In some embodiments, the temporalCopy function includes one or more additional parameters that specify this information.
If the timestamps are the same (for example, both the known state and the specified timestamp correspond to the current time), the known state is identical to the specified state and there are no changes. Therefore, at 318, an unmodified copy of the memory region in its known state is created, and the process terminates at 320.
A known timestamp that is later than the specified timestamp indicates that an earlier state of the data is to be generated by undoing changes made to the data in the source memory region, and therefore the undo log is selected. Accordingly, at 306, the undo log is scanned to identify the committed changes that were made to the source memory region between the specified time and the known time. In some embodiments, the scan starts from the latest entry in the undo log that is earlier than the known timestamp (or from the end of the log in the case where the current time is used as the known time), and terminates when a timestamp earlier than the specified time is reached in the log or when the entire log has been scanned. At 308, the changes are applied to the destination buffer in the following order: the latest change is applied first, so as to undo the changes made to the source buffer between the specified time and the known time. The resulting data in the destination buffer is the desired data as of the specified time. If no changes are identified, no changes are applied. The process then terminates at 320.
A known timestamp that is earlier than the specified timestamp indicates that a later state of the data is to be generated by re-applying the changes that occurred in the source memory region after the known state was committed, and therefore the redo log is selected. Accordingly, at 310, the redo log is scanned to identify the committed changes that were made to the source memory region between the known time and the specified time. In some embodiments, the scan starts from the earliest entry in the redo log that is later than the known timestamp, and terminates when a timestamp later than the specified time is reached in the log or when the entire log has been scanned. At 312, the changes are applied to the destination buffer in the following order: the earliest change is applied first, so as to re-apply the changes made to the source buffer between the known time and the specified time. If no changes are identified, no changes are applied. The process then terminates at 320.
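Steps 302 through 312 can be sketched in software as follows. This is an illustrative model only, not the CCP implementation. The `LogEntry` record layout, the `temporal_copy` signature (which passes the known time and both logs explicitly), the 0-based entry offsets, and the exact interval boundaries (changes stamped exactly at the later of the two times are included, changes stamped exactly at the earlier time are not) are all assumptions chosen so that the result matches the example of FIGS. 4A-4C.

```python
from collections import namedtuple

# Hypothetical log record: time of the change, entry offset, PLID to store.
LogEntry = namedtuple("LogEntry", "time offset plid")


def temporal_copy(src, known_time, specified_time, undo_log, redo_log):
    """Return the indirect line (list of PLIDs) as of specified_time."""
    dest = list(src)                      # 302: copy PLIDs, not data lines
    if specified_time == known_time:      # 304/318: unmodified copy
        return dest
    if specified_time < known_time:
        # 306: changes made after the specified time, up to the known time
        changes = [e for e in undo_log
                   if specified_time < e.time <= known_time]
        changes.sort(key=lambda e: e.time, reverse=True)   # 308: latest first
    else:
        # 310: changes made after the known time, up to the specified time
        changes = [e for e in redo_log
                   if known_time < e.time <= specified_time]
        changes.sort(key=lambda e: e.time)                 # 312: earliest first
    for entry in changes:
        dest[entry.offset] = entry.plid
    return dest


# Logs with the values used in the example of FIGS. 4A-4C (times as HHMM).
undo = [LogEntry(1105, 3, "P3"), LogEntry(1110, 1, "P1")]
redo = [LogEntry(1105, 3, "P9"), LogEntry(1110, 1, "P10")]

earlier = temporal_copy(["P0", "P10", "P2", "P9"], 1110, 1100, undo, redo)
later = temporal_copy(["P0", "P1", "P2", "P3"], 1100, 1110, undo, redo)
midway = temporal_copy(["P0", "P10", "P2", "P9"], 1110, 1105, undo, redo)
```

The comparison of the two timestamps is the only thing that selects the log, so the same function serves both rollback and roll-forward.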
In some embodiments, the process optionally determines whether a copy of the memory region as of the specified time already exists. For example, a separate log of the times at which the memory region was checkpointed is kept and is used to determine whether a copy exists for that time, and the undo/redo log is checked to determine whether there are further changes to the checkpointed snapshot. If a checkpointed snapshot exists and there are no changes, a logical copy of the snapshot is provided, and the process of re-creating the snapshot described above is not invoked.
In some embodiments, virtual-to-physical address translation information is provided to the CCP, and the CCP supports temporal copy using virtual addresses. Log information can further be stored using virtual addresses rather than physical addresses.
FIGS. 4A-4C are data diagrams illustrating examples of the data and logs used in an example consistent read process. FIG. 4A illustrates a data set undergoing changes in a transaction. In this example, the data is stored in structured memory. Specifically, the memory region stores an indirect line, which stores a set of PLIDs that reference a corresponding set of data lines. Note that the values of the PLIDs can be arbitrary, and are chosen to reference the first, second, third, and fourth data lines.
At t0=11:00, the indirect line stores PLIDs P0, P1, P2, and P3, which reference data lines storing A, E, C, and F, respectively. This is the initial committed state of the memory region at the start of the transaction. There are no entries in the undo log or the redo log.
At t1=11:05, the translation line entry storing PLID P3 is modified to PLID P9, which references D instead of F. Accordingly, the undo log records that, at time t1, the entry at offset 3 from the beginning of the line stored PLID P3; and the redo log records that, at time t1, the entry at offset 3 from the beginning of the line stores PLID P9.
At t2=11:10, the translation line entry storing PLID P1 is modified to PLID P10, which references B instead of E. Accordingly, an entry is added to the undo log specifying that, at time t2, the entry at offset 1 from the beginning of the line stored PLID P1; and the redo log records that, at time t2, the entry at offset 1 from the beginning of the line stores PLID P10. At this point, the transaction is ready to be committed.
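The log records produced by the two writes of FIG. 4A can be reproduced with a small model. `LoggedIndirectLine` and `LogEntry` are hypothetical names, the offsets are 0-based (so the second entry is offset 1), and skipping writes that leave the PLID unchanged is an assumption of this sketch.

```python
from collections import namedtuple

# Hypothetical log record: time of the change, entry offset, PLID value.
LogEntry = namedtuple("LogEntry", "time offset plid")


class LoggedIndirectLine:
    """Indirect line whose PLID changes are recorded in undo and redo logs."""

    def __init__(self, plids):
        self.plids = list(plids)
        self.undo_log = []   # old PLIDs, used to reach earlier states
        self.redo_log = []   # new PLIDs, used to reach later states

    def write(self, time, offset, new_plid):
        old_plid = self.plids[offset]
        if old_plid == new_plid:
            return           # assumption: an unchanged entry is not logged
        self.undo_log.append(LogEntry(time, offset, old_plid))
        self.redo_log.append(LogEntry(time, offset, new_plid))
        self.plids[offset] = new_plid


line = LoggedIndirectLine(["P0", "P1", "P2", "P3"])   # state at t0 = 11:00
line.write("11:05", 3, "P9")    # t1: fourth entry now references D, not F
line.write("11:10", 1, "P10")   # t2: second entry now references B, not E
```

Each write appends one record to each log, so the two logs stay aligned entry for entry and differ only in whether they hold the old or the new PLID.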
In some embodiments, change and need to be retracted(It may be due to conflicting with other affairs).Therefore, in Fig. 4 B In, restore snapshot earlier using later snapshot.Known time is 11:10 and specified time be 11:00.To destination The copy of carry out source state(That is, the reference to the identical data row comprising A, B, C and D carries out source PLID P0, P10, P2 and P9 Copy).Cancel daily record is scanned and restores destination data row A, B, C and D set to determine how.According to institute in Fig. 4 A The cancel daily record shown restores second entry to P1 from P 10(So that the data content B of lower layer is reconditioned to E), and by Four entries are restored from P9 to P3(So that data content D is reconditioned to F).The recovery is by obtaining old value from cancel daily record PLID is simultaneously written into specified translation entries and is performed.Generate the destination caching of reference data row A, E, C and F.
In some embodiments, later state is generated using relatively early the snapshot of check point is set.This is illustrated in figure 4 c. Know that the time is 11:00 and specified time be 11:10.It is located in the copy of carry out source PLID P0, P1, P2 and P3 in purpose.Counterweight It is scanned as daily record and is gathered in destination data row A, E, C, F with that will change to re-apply, wherein fourth entry changes from P3 For P9(And data content changes into D from F), and the second data line changes into P10 from P1(And data content changes from E For B).Generate PLID P0 of reference A, B, C and D, the destination caching of P10, P2, P9.
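The rollback of FIG. 4B and the roll-forward of FIG. 4C can be sketched in software as follows. This is a minimal illustration and not the CCP implementation: the function name `temporal_copy`, the representation of log records as `(time, offset, plid)` tuples, times expressed as minutes since midnight, and zero-based entry offsets are all assumptions made for the sketch.

```python
def temporal_copy(source, known_time, specified_time, undo_log, redo_log):
    """Return a destination buffer of PLIDs as of specified_time.

    source holds the PLIDs at known_time. Log records are hypothetical
    (time, offset, plid) tuples; offsets are zero-based entry indexes.
    """
    dest = list(source)  # copy PLIDs only, never the data lines themselves
    if specified_time < known_time:
        # Roll back: apply undo records newest first so the oldest value wins.
        for t, offset, old_plid in sorted(undo_log, reverse=True):
            if specified_time < t <= known_time:
                dest[offset] = old_plid
    else:
        # Roll forward: apply redo records oldest first.
        for t, offset, new_plid in sorted(redo_log):
            if known_time < t <= specified_time:
                dest[offset] = new_plid
    return dest


# Times as minutes since midnight: 11:00, 11:05, 11:10.
T0, T1, T2 = 660, 665, 670
undo_log = [(T1, 3, "P3"), (T2, 1, "P1")]
redo_log = [(T1, 3, "P9"), (T2, 1, "P10")]

# FIG. 4B: known time 11:10, specified time 11:00.
rolled_back = temporal_copy(["P0", "P10", "P2", "P9"], T2, T0, undo_log, redo_log)
# FIG. 4C: known time 11:00, specified time 11:10.
rolled_forward = temporal_copy(["P0", "P1", "P2", "P3"], T0, T2, undo_log, redo_log)
```

Because only PLIDs are copied and rewritten, the cost of the copy is independent of how much data each line holds.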
In some embodiments, the scanning of the log (306 or 310 of process 300) is performed before the data in the source is copied to the destination. For each page (or subpage), a set of bits corresponding to the data lines is maintained, where each bit corresponds to one line. The set of bits is reset at a known time at which the state of the memory region is known (for example, the start of a transaction). If a log record indicates that a particular entry in the indirect line has been changed, the corresponding bit is marked. Only source PLIDs that are not marked are copied to the destination. The changes are still applied to derive the desired data lines in the destination. To illustrate using FIGS. 4B and 4C, the bit mask 0000 represents entries 0-3 at the start of the transaction. At the end of the transaction, the resulting bit mask is 0101, because the PLIDs referencing the second and fourth data lines have been changed. The first and third data lines of the source buffer (PLIDs P0 and P2) are unchanged, so their corresponding bits are not marked. These PLIDs are copied to the corresponding positions in the destination buffer. The second and fourth entries are marked because of the changes recorded in the log, and are not copied to the second and fourth positions of the destination buffer. Instead, only the changes according to the log are copied to the corresponding positions in the destination buffer. In this example, depending on which log is used, either P1 and P3 referencing data lines E and F (FIG. 4B) or P10 and P9 referencing data lines B and D (FIG. 4C) are copied to the second and fourth positions in the line.
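The mask-first variant described above can be sketched as follows. The function name `masked_copy` and the `(offset, plid)` record shape are assumptions of this sketch, offsets are zero-based, and the records are assumed to be supplied already in the order in which they should be applied.

```python
def masked_copy(source, log_records):
    """Scan the log first, then copy only the unmarked source PLIDs.

    log_records is a list of hypothetical (offset, plid) pairs taken from
    the undo log (rollback) or the redo log (roll-forward).
    """
    n = len(source)
    mask = [0] * n                      # reset at the known time
    for offset, _ in log_records:
        mask[offset] = 1                # entry was changed according to the log
    dest = [None] * n
    for i in range(n):
        if not mask[i]:
            dest[i] = source[i]         # unchanged entry: copy the source PLID
    for offset, plid in log_records:
        dest[offset] = plid             # changed entry: take the value from the log
    return mask, dest


# FIG. 4B (undo log) and FIG. 4C (redo log).
mask_b, dest_b = masked_copy(["P0", "P10", "P2", "P9"], [(1, "P1"), (3, "P3")])
mask_c, dest_c = masked_copy(["P0", "P1", "P2", "P3"], [(3, "P9"), (1, "P10")])
```

Read left to right, both masks come out as 0101, matching the example in the text.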
The CCP can implement other degenerate forms or variations of the temporal copy operation. In some embodiments, the CCP implements a "simultaneous" copy from source to destination (that is, a temporal copy in which the specified time is the same as the known time), producing an exact copy while using PLID copying as an optimization over copying the actual data. In some embodiments, the CCP implements a "clear" of a memory region, as an optimized version of copying a source segment of all zeros. In some embodiments, the CCP can implement a move of a memory region that clears each PLID in the source region as part of moving it to the destination region, thereby avoiding the overhead of reference count changes while also "clearing" the source region.
Merge-update copy
In some embodiments, the CCP is configured to perform an atomic merge-update copy function (also referred to as a merge-update operation). Details of this operation and its implementation are discussed in U.S. Patent Application No. 12/804,901, attorney docket number HICAP004, which is incorporated herein by reference in its entirety for all purposes. The merge-update operation allows concurrent updates to be merged even when the modifications made by different threads or processes conflict, as long as the conflicts are logically consistent and can be resolved to reach a predictable memory state.
In some embodiments, an updating process or thread maintains a copy of the original data structure at the start of an update operation or logical transaction, and performs its updates on the copy. When the updates are complete, information associated with the original data structure (such as a pointer) is compared with information associated with the current version of the data structure. If both point to the same structure, there are no conflicting updates, and a compare-and-swap (CAS) operation is performed to replace the original version with the new, modified version of the data structure. However, if the original data structure differs from the current data structure, the updates to the current data structure can be merged into the new modified version, as long as the differences are logically consistent. Logically consistent differences are concurrent changes made by different threads or processes that can be resolved to reach a memory state consistent with the application semantics. When logically consistent modifications made to a memory structure by multiple threads are merged, the result is as if each thread or process had made its modifications to the memory structure atomically and independently. As explained in further detail below, there are different ways of determining whether modifications are logically consistent for different types of data. In some embodiments, logical consistency is determined using a logical consistency constraint selected from a set of potential constraints. Once the differences are merged, the CAS operation is retried. If the differences are logically inconsistent, for example when two concurrent processes each attempt to add an entry with the same key to a map, the merge-update operation fails and some operations are retried.
In some embodiments, the entries in the undo/redo logs correspond to updates made to the memory region by separately committed transactions between the start of the current transaction and the current time. When copying the memory region, the CCP is configured to apply the changes made to the lines of the specified memory region by concurrent transactions, as long as those changes do not conflict with the updates made by the current transaction. In some embodiments, the CCP is further configured to resolve certain logically consistent conflicts.
FIG. 5 is a data diagram illustrating an example of an embodiment of a merge-update copy process. Pseudocode for the merge-update copy is given below and explained in conjunction with FIG. 5.
As shown in FIG. 5, at t0 (the initial state), the indirect line in the memory region includes PLIDs P1, P2, and P3, which reference data lines A, B, and C, respectively. Two concurrent transactions each have a copy of a snapshot of the indirect line, and each transaction makes its own set of changes to its copy. During the modification, each transaction takes a snapshot of the initial state, which involves creating a copy of the indirect line that references the same data lines A, B, and C. Accordingly, changes made by one transaction are not visible to the other.
The first process changes the first position in the indirect line by changing PLID P1 to PLID P4 (changing the referenced data from A to A'), and changes the third position by changing PLID P3 to P5 (changing the referenced data from C to D). The changes are committed at time t1, and the indirect line formed by P4, P2, and P5 is referred to as the current committed copy of the state.
Meanwhile, the second process changes the second position in the indirect line by changing PLID P2 to PLID P8 (changing the referenced data line from B to B'), and changes the third position by changing PLID P3 to PLID P9 (changing the referenced data line from C to E). The changes made by the second transaction have not yet been committed (so the references are shown with dashed lines), and the indirect line formed by P1, P8, and P9 is referred to as the current transaction copy of the state. At time t2 (later than t1), the second transaction needs to commit its changes. Because the changes were made concurrently by two transactions, the commit goes through a merge-update process.
C-style pseudocode is discussed below. In the pseudocode, the following pointers are initially specified: scp initially points to the first position of the snapshot copy, so *scp initially references the PLID corresponding to data line A; ccp initially points to the first position of the current committed copy of the state, so *ccp initially references the PLID corresponding to data line A'; and ctp initially points to the first position of the current transaction copy of the state, so *ctp initially references the PLID corresponding to data line A. Incrementing a pointer advances it to reference the PLID of the next line. The pseudocode specifies:
for each position corresponding to a data line in the memory region {
    if (*ccp is changed relative to *scp) {
        if (*ctp is equal to *scp) {  // so not changed by the current transaction
            write *ccp to *ctp;
        } else {
            // handle write-write conflict
            mergedLine = lineMergeUpdate(*scp, *ccp, *ctp, mergeCategory);
            if (merge failed) return failure;
            write mergedLine to *ctp;
        }
    }
    ++scp; ++ccp; ++ctp;
}
Referring to FIG. 5, for the first data line, *ccp (PLID P4) is modified relative to *scp (PLID P1), but *ctp (PLID P1) is equal to *scp (PLID P1). The line was therefore changed by only one transaction, and *ccp is written to *ctp (PLID P1 is changed to PLID P4).
For the second data line, *ccp (PLID P2) is not modified relative to *scp (PLID P2). The line was therefore again modified by at most one transaction, and *ctp (PLID P8) is left unchanged.
For the third data line, *ccp (PLID P5) is modified relative to *scp (PLID P3), and *ctp (PLID P9) is not the same as *scp (PLID P3). This is referred to as a write-write conflict, because two transactions attempt to change the same data. Accordingly, the lineMergeUpdate function is called to determine whether the write-write conflict is logically consistent, and to merge the conflicting changes if it is. The parameter mergeCategory indicates the form of merging to use. The default result of lineMergeUpdate is failure (for example, the situation shown in FIG. 5, in which the two different data contents D and E produce a write-write conflict that is logically inconsistent and cannot be resolved). When lineMergeUpdate fails, the current uncommitted transaction is aborted. However, certain other kinds of merges are permissible (that is, the write-write conflict is logically consistent). For example, if mergeCategory indicates that the value in the data line is to be treated as a counter, the lineMergeUpdate function determines the difference between the snapshot copy and the current transaction value and adds that difference to the counter in the line to produce mergedLine, which provides semantics that resolve the conflict. mergeCategory can also specify particular constraints. For example, in the case of a monotonically increasing counter, if the merged value violates the constraint that the counter value must increase monotonically (for example, when the counter is reset by one of the transactions), the merge-update operation fails.
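The merge-update loop and the counter form of lineMergeUpdate described above can be sketched in runnable form as follows. This is an illustration under stated assumptions, not the CCP's implementation: each position is modeled as directly holding a comparable value (standing in for a PLID or for the content it references), and the category names `"counter"` and `"monotonic_counter"`, the exception `MergeFailure`, and the specific monotonicity check are hypothetical choices made for the sketch.

```python
class MergeFailure(Exception):
    """Raised when a write-write conflict is not logically consistent."""


def line_merge_update(snap, committed, current, merge_category=None):
    """Resolve a write-write conflict on one line, or raise MergeFailure."""
    if merge_category in ("counter", "monotonic_counter"):
        if merge_category == "monotonic_counter" and (
            committed < snap or current < snap
        ):
            # One transaction decreased (for example, reset) the counter.
            raise MergeFailure("monotonic counter constraint violated")
        # Add the current transaction's increment to the committed value.
        return committed + (current - snap)
    raise MergeFailure("conflict is not logically consistent")  # default


def merge_update_copy(scp, ccp, ctp, merge_category=None):
    """Merge committed changes (ccp) into the current transaction copy (ctp)."""
    merged = list(ctp)
    for i in range(len(scp)):
        if ccp[i] != scp[i]:                 # changed by a committed transaction
            if merged[i] == scp[i]:          # not changed by the current one
                merged[i] = ccp[i]
            else:                            # write-write conflict
                merged[i] = line_merge_update(
                    scp[i], ccp[i], merged[i], merge_category)
    return merged


# Counter lines: snapshot 10, committed 13 (+3), current 15 (+5).
counter_result = merge_update_copy([10], [13], [15], "counter")

# FIG. 5: the third position holds D in one copy and E in the other.
try:
    merge_update_copy(["A", "B", "C"], ["A'", "B", "D"], ["A", "B'", "E"])
    fig5_merged = True
except MergeFailure:
    fig5_merged = False
```

With the default category, the FIG. 5 inputs fail at the third position, which corresponds to aborting the uncommitted transaction.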
In this example, the memory region state of the current transaction is effectively a snapshot of the state created at time t0, changed by the various updates performed up to time t2 (the end time of the current transaction). The merge-update copy effectively incorporates the updates committed to the memory region by other concurrent transactions between time t0 and time t2. Specifically, the updates are merged if they can be merged (that is, if there are no conflicts, or if the conflicts are logically consistent). Accordingly, the merge-update copy function can be implemented as a temporal copy operation on a given region with a known start time t0 and a specified end time t2. The temporal copy operation additionally detects write-write conflicts (for example, by tracking whether the same PLID position is changed in multiple logs from different transactions) and performs the merge operation when possible.
In some embodiments, each redo log entry includes information about the transaction that made the corresponding change, which allows the merge-update copy function to use the redo log to determine the committed modifications and to perform the merge-update copy operation.
In some embodiments, the merge-update copy is invoked at transaction commit for each memory region modified by the transaction. The redo log is used to detect any commit conflicts, to resolve them when possible, and to abort the transaction when it is not possible. By contrast, in existing systems a transaction needs to check explicitly whether another transaction has written to the same location in order to detect write-write conflicts, and this check incurs substantial overhead. In a system implementing hardware-supported temporal copy, the redo log can be used to detect write-write conflicts when a transaction is about to commit its changes. In some embodiments, redo log entries include information about which transaction made the change, and when a transaction is about to commit its changes, the applicable entries in the redo log are located and examined to determine whether there is a conflict. Identified conflicts are resolved when possible. If a conflict cannot be resolved, the transaction is aborted.
In some embodiments, the merge-update operation is invoked only when the same page has been changed by both the current transaction and another concurrently committed transaction. If the page has been changed by only a single transaction, there is no conflict and no merge is needed. In some embodiments, each physical page includes metadata indicating that it has been changed by multiple transactions, and the operating system uses this metadata to determine whether to invoke the merge-update operation for the physical page.
Log representation
FIG. 6A is a diagram illustrating an embodiment of physical data lines in memory. As shown, the physical memory is divided into subpages. Each subpage includes a preset number of data lines (32 in this example, although other numbers can be used in other embodiments). The start address of a subpage is denoted subpageAddr. The lines can be represented using a line mask, in which each bit corresponds to a particular line.
In this example, the line mask is a 32-bit value with one bit per line in the subpage, where the i-th bit in the mask corresponds to the i-th line of the subpage. Initially, the line mask is set to a default value, such as 0. If a line is changed, its corresponding line mask bit is set to 1. Accordingly, a subpage update record (SPUR) with the following fields can be used to represent information about the location of a particular data line and whether the PLID referencing that data line has been changed:
[subpageAddr, lineMask],
where subpageAddr is the address of the subpage in which the line is located, and lineMask is the line mask, which includes a set of bits indicating the modification state of the corresponding lines.
The size of a subpage is determined by multiplying the size of lineMask (in bits) by the line size. In an embodiment using 64-byte lines and a 32-bit lineMask, the subpage size is 2 kilobytes.
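The [subpageAddr, lineMask] record and the subpage size calculation can be illustrated with a small sketch. The class name `Spur`, its method names, and the choice of the least significant bit as bit 0 of the mask are assumptions made for illustration; the text specifies only the two fields and the per-line bit.

```python
from dataclasses import dataclass

LINE_BYTES = 64                          # line size from the example embodiment
MASK_BITS = 32                           # one mask bit per line in a subpage
SUBPAGE_BYTES = LINE_BYTES * MASK_BITS   # 64 * 32 = 2048 bytes


@dataclass
class Spur:
    """Subpage update record: [subpageAddr, lineMask]."""
    subpage_addr: int
    line_mask: int = 0   # default value 0: no line changed

    def mark_line(self, i):
        """Set bit i to record that the i-th line of the subpage changed."""
        if not 0 <= i < MASK_BITS:
            raise IndexError("line index outside subpage")
        self.line_mask |= 1 << i

    def changed_lines(self):
        """Return the indexes of the lines marked as changed."""
        return [i for i in range(MASK_BITS) if (self.line_mask >> i) & 1]


spur = Spur(subpage_addr=0x1000)
spur.mark_line(0)
spur.mark_line(5)
```

Here `SUBPAGE_BYTES` evaluates to 2048, matching the 2-kilobyte subpage of the example.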
FIG. 6B is a diagram illustrating an embodiment of a log representation based on the data line representation of FIG. 6A. In this example, undo log 602 is represented as a sequence of PLID values corresponding to the data lines that were overwritten. Similarly, redo log 604 is represented as a sequence of PLID values corresponding to the modified or new data lines that were written.
Each PLID is mapped to a corresponding physical data line location. In this example, the location information is stored in metadata log 606 to reduce the memory needed for log entries. Referring to FIG. 6A, the metadata log is represented as a sequence of SPURs, one per subpage. In each SPUR, the i-th bit, which corresponds to the i-th line of the subpage, is set to a specified value (for example, 1) to indicate that the line has been changed. If a line has been changed, the new PLID is in the redo log and the previous PLID is in the undo log. Accordingly, the same metadata log can be used to generate both the undo log and the redo log.
In some embodiments, the sizes of the subpage address and line mask fields can be further optimized, particularly when the SPUR size in bits is not required to be a power of 2. The goal of the optimization is to minimize the amount of data that needs to be scanned to perform undo processing as part of generating a consistent read block. For example, with an 8-bit mask, each record covers 0.5 kilobytes, so with a 34-bit page address field each SPUR is 42 bits and can address 8 terabytes of memory. This choice of parameters reduces the memory bandwidth required for log access to roughly 70 percent of that required with a 64-bit SPUR in the case where nearly every update touches a single line per page. The parameters can be optimized based on statistics of the expected number of lines updated per page during operation.
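The figures in this example can be checked with a short calculation. The helper name `spur_parameters` is hypothetical, and the 64-byte line size is carried over from the earlier embodiment as an assumption.

```python
LINE_BYTES = 64  # line size assumed from the earlier example embodiment


def spur_parameters(mask_bits, addr_bits):
    """Return (bytes covered per SPUR, SPUR size in bits, addressable bytes)."""
    coverage = mask_bits * LINE_BYTES            # memory covered by one record
    spur_bits = mask_bits + addr_bits            # mask field plus address field
    addressable = (1 << addr_bits) * coverage    # every address names one subpage
    return coverage, spur_bits, addressable


coverage, spur_bits, addressable = spur_parameters(mask_bits=8, addr_bits=34)
bandwidth_ratio = spur_bits / 64                 # relative to a 64-bit SPUR
```

With these inputs the coverage is 512 bytes, the record is 42 bits, the addressable memory is 2^43 bytes (8 terabytes), and the ratio 42/64 is about 0.66, in line with the "roughly 70 percent" stated in the text.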
Additional metadata information can be stored in the log by reserving a set of special addresses which indicate that a SPUR stores metadata information rather than an actual page data update. For example, a timestamp can be stored by writing a SPUR with an address that is reserved to indicate a timestamp and that does not correspond to a subpage address (for example, an address in which every bit is set to 1). Such a special address is also referred to as a flag. Other metadata information, such as the start of a transaction or the end of a transaction, can be handled similarly. By reserving a power-of-2 block of addresses for each such value, the low-order bits of the page address field can be used to augment the bits in the mask field in order to store large values. For example, by using a block of 256 addresses for the time address, the low-order 8 bits of the page address can be used to augment the mask field, providing 24 bits for the timestamp in a configuration that uses a 16-bit line mask.
The size requirements of these parameters can be reduced by storing them as offsets relative to some base value rather than as absolute values. For example, a timestamp can be stored as an offset relative to an epoch base value. The absolute timestamp can then be the 24-bit offset plus a 24-bit epoch base, for a total of 48 bits of effective timestamp. The epoch base value is updated by writing a SPUR to the log using a special page address corresponding to the epoch register.
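One way the timestamp encoding could work is sketched below. Several details are assumptions not given in the text: the 34-bit address field width, placing the reserved 256-address block at the top of the address space, putting the high 8 bits of the offset into the address and the low 16 bits into the mask, and forming the absolute timestamp by placing the epoch base in the upper 24 bits.

```python
ADDR_BITS = 34                            # assumed address field width
MASK_BITS = 16                            # 16-bit line mask configuration
TS_BLOCK = (1 << ADDR_BITS) - 256         # assumed reserved block of 256 addresses


def encode_timestamp(offset24):
    """Pack a 24-bit timestamp offset into an (address, mask) SPUR pair."""
    if not 0 <= offset24 < (1 << 24):
        raise ValueError("offset must fit in 24 bits")
    addr = TS_BLOCK | (offset24 >> MASK_BITS)      # high 8 bits in the address
    mask = offset24 & ((1 << MASK_BITS) - 1)       # low 16 bits in the mask
    return addr, mask


def decode_timestamp(addr, mask, epoch_base):
    """Recover the 48-bit absolute timestamp from a timestamp SPUR."""
    if addr < TS_BLOCK:
        raise ValueError("not a timestamp SPUR")
    offset24 = ((addr & 0xFF) << MASK_BITS) | mask
    return (epoch_base << 24) | offset24           # 24-bit base, 24-bit offset


addr, mask = encode_timestamp(0xABCDEF)
absolute = decode_timestamp(addr, mask, epoch_base=0x000001)
```

Because the block size is a power of 2, the low 8 address bits are free to carry data while the upper bits still identify the record as a timestamp.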
With this representation, the CCP maintains pointers into the undo and redo PLID logs and, as it reads SPURs, advances these pointers by the number of PLIDs indicated in each SPUR. Accordingly, the correspondence does not need to be stored explicitly in the log.
The fixed-size SPUR representation also allows the metadata log to be read both backward and forward. The representation also makes it easy for the CCP to generate the undo/redo logs.
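The pointer-advancing scheme can be sketched as follows, reading the metadata log forward against the redo PLID log and backward against the undo PLID log. The function names are hypothetical, and the sketch assumes that the PLIDs belonging to one SPUR are stored in ascending line order and that lines are 64 bytes.

```python
LINE_BYTES = 64  # assumed line size


def walk_forward(metadata_log, redo_plids):
    """Pair each changed line address with its new PLID, oldest first."""
    out, ptr = [], 0
    for subpage_addr, line_mask in metadata_log:
        for i in range(line_mask.bit_length()):
            if (line_mask >> i) & 1:
                out.append((subpage_addr + i * LINE_BYTES, redo_plids[ptr]))
                ptr += 1          # advance by one PLID per set mask bit
    return out


def walk_backward(metadata_log, undo_plids):
    """Pair each changed line address with its old PLID, newest first."""
    out, ptr = [], len(undo_plids)
    for subpage_addr, line_mask in reversed(metadata_log):
        for i in reversed(range(line_mask.bit_length())):
            if (line_mask >> i) & 1:
                ptr -= 1          # step back by one PLID per set mask bit
                out.append((subpage_addr + i * LINE_BYTES, undo_plids[ptr]))
    return out


metadata_log = [(0x0000, 0b1010), (0x0800, 0b0001)]
forward = walk_forward(metadata_log, ["R1", "R2", "R3"])
backward = walk_backward(metadata_log, ["U1", "U2", "U3"])
```

No per-entry address is stored with the PLIDs; the positions are recovered entirely from the mask bits.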
Logging
In some embodiments, one or more regions of the physical memory of an application are designated as logged. This can be done by the operating system, which sets particular configuration registers or configures the memory controller to indicate the location and size of such memory regions. Each write operation to a logged memory region then causes the PLID to be copied to the log region, together with a SPUR stored in the metadata log.
FIG. 7 is a flowchart illustrating an embodiment of a process for generating log information. Process 700 can be performed by the CCP and/or the memory controller.
At 702, a write operation from the CPU to a logged physical memory region is detected. In some embodiments, a write operation to the cache or to the actual underlying memory system (for example, main memory) is detected by logic in the memory controller and/or the CCP that checks the address associated with the write operation against the logged memory regions. For memory regions represented using the indirect memory representation discussed above, the write operation changes the indirect representation of a physical data line (for example, the content of a PLID, or which data the PLID references), but does not change the data content of the data line itself.
At 704, one or more log records associated with the write operation are recorded. Specifically, the old value of the changed content is recorded in the undo log, the new value is recorded in the redo log, or both values are recorded in their respective logs. In some embodiments, configuration information associated with the memory region specifies whether to update the undo log, the redo log, or both. Identification information associated with the indirect representation of the changed content and identification information associated with the physical data line are recorded. In some embodiments, the PLID that is modified to reference a different data line is inserted into the next entry at the current (tail) position of the PLID queue in the appropriate log. In addition, a SPUR is generated based on the referenced data line corresponding to the change, and the SPUR is written to the metadata log.
In some embodiments, a set of undo, redo, and SPUR records is created whenever a write operation occurs. However, logging on every write can be inefficient, because the same memory segment may be written many times. For example, if a PLID in an indirect line references first A, then B, then C before the transaction is committed, only the value C is relevant for the purpose of tracking the committed memory state. Accordingly, in some embodiments, log records are created not when a write operation occurs but when the transaction involving one or more write operations is ready to be committed. This is accomplished by taking a snapshot of the memory region as represented using the indirect memory structure.
In some embodiments, taking a snapshot (copying) includes copying the PLIDs that indirectly represent the memory lines in the memory region of interest. In some embodiments, the indirection in memory access means that a snapshot can be created by copying the PLIDs associated with the memory region rather than copying the actual data lines referenced by the PLIDs. In some embodiments, the CCP performs this copy of the PLIDs on request, as a degenerate form of temporal copy in which the specified time is the same as the current time and there is no undo or redo, because nothing is to be changed. The reference counting of the lines and the immutability of these shared lines mean that the copied PLIDs constitute a snapshot of the memory region state even though the actual data has not been copied.
In some embodiments, a snapshot of the memory region is taken at the initial state (that is, at the start of the logging time interval, before the memory region is changed and logging takes place). However, taking a snapshot of the entire region can be computationally expensive. Accordingly, in some embodiments, the snapshot is taken on demand when the first write operation is detected. In some embodiments, the snapshot is generated on demand when the first write operation to the entire memory region is detected, and no snapshot is needed if the memory region is never changed, since there will then be no log entries. In some embodiments, snapshots are taken at the granularity of a subregion (such as a page). Snapshots are taken only of pages that are actually written during the logging time interval. Specifically, the first write to a page is detected, and the operating system is notified to create a snapshot of the page. The operating system can invoke the CCP to assist in creating the snapshot. The snapshot of the page is created by copying the PLIDs referencing the data lines of the page to a shadow indirect structure for the page. This process is repeated for each page that is written for the first time during the time interval of interest, and information about each write is recorded in a snapshot data structure. If each PLID is 32 bits and corresponds to a line size of 64 bytes (512 bits), the amount of data copied to create a snapshot of the page can be as little as 1/16 of the size of the page.
A complete snapshot of the memory region therefore consists of the pages explicitly snapshotted as described above, together with the pages of the current state that have not yet been changed.
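The on-demand, per-page snapshot described above can be sketched as follows. The class `LoggedRegion` and its method names are hypothetical, and a Python dictionary of PLID lists stands in for the indirect structure and its shadow copy.

```python
class LoggedRegion:
    """Pages of PLIDs with a shadow copy made on the first write to a page."""

    def __init__(self, pages):
        self.pages = pages      # page number -> list of PLIDs (current state)
        self.shadow = {}        # page number -> PLIDs at the initial state

    def write_plid(self, page, index, plid):
        if page not in self.shadow:
            # First write to this page: snapshot its PLIDs, not its data.
            self.shadow[page] = list(self.pages[page])
        self.pages[page][index] = plid

    def initial_state(self, page):
        """Snapshotted pages come from the shadow; untouched pages are current."""
        return self.shadow.get(page, self.pages[page])


region = LoggedRegion({0: ["P0", "P1", "P2", "P3"], 1: ["P4", "P5"]})
region.write_plid(0, 3, "P9")
region.write_plid(0, 1, "P10")

# A 32-bit PLID per 512-bit line: the snapshot copies 1/16 of the page size.
snapshot_fraction = 32 / 512
```

Page 1 is never written, so it is never copied, yet `initial_state(1)` still returns its initial PLIDs.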
In embodiments that support the snapshot technique discussed above, for the pages that have been changed, the CCP can create a redo log of the data lines changed during the logging time interval (in other words, of the PLIDs that now reference different data lines) by comparing the PLIDs in the current state of the memory region with the PLIDs at the corresponding offsets in the initial state snapshot, and transferring the current PLIDs that differ to the log together with identification information. The undo log can be created similarly using the same comparison, saving instead the corresponding PLIDs from the initial state snapshot. In some embodiments, this operation is performed at the time the transaction is committed.
Taking FIGS. 4A-4C as an example, the undo log and the redo log can be generated using this technique. Assume that PLIDs P0-P3 reference data lines in the same page. When the first write operation to the page occurs, a snapshot of the original page is taken, duplicating the PLID values. When a log is to be generated, the PLIDs in the current state of the memory region are compared with the PLIDs in the snapshot, the current PLIDs that differ from the PLIDs in the initial state snapshot are identified, and the information is saved to the log. In addition, once the redo log has been generated, the undo log can be derived by taking each entry in the redo log and recording the corresponding value from the snapshot. For example, referring to FIG. 4A, at 11:10, given that the redo log includes an entry at offset 3 from the beginning of the line (P9) and that the entry at that offset in the initial state snapshot has the value P3, it can be determined that the undo log also includes an entry at the same position storing the value P3. Accordingly, the undo log records for the indirect line can be determined from the redo log records (which include information about the positions of the logged changes) and the old values at the corresponding positions in the initial state snapshot.
An example set of pseudocode for appending log information to the undo and redo logs at commit time is as follows:
for each subpage in the snapshot
    for each line i in the subpage
        if the i-th PLID in the snapshot differs from the i-th PLID in the current subpage
            if redo logging is enabled, enqueue the current PLID to the redo log;
            if undo logging is enabled, enqueue the snapshot PLID to the undo log;
            record the change in the SPUR of the subpage, for example by setting the i-th bit in lineMask;
    enqueue the SPUR of the subpage to the metadata log;
end
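The commit-time pseudocode above can be rendered as a runnable sketch. The function name `generate_logs`, the dictionary representation of the snapshot and current state keyed by subpage address, and the use of the least significant bit as line 0 of the mask are assumptions made for illustration.

```python
def generate_logs(snapshot, current, log_redo=True, log_undo=True):
    """Compare snapshot and current PLIDs and build the three logs.

    snapshot and current map a subpage address to its list of PLIDs.
    Returns (redo_log, undo_log, metadata_log), where metadata_log holds
    (subpage_addr, line_mask) pairs, one SPUR per snapshotted subpage.
    """
    redo_log, undo_log, metadata_log = [], [], []
    for subpage_addr, old_plids in snapshot.items():
        new_plids = current[subpage_addr]
        line_mask = 0
        for i, (old, new) in enumerate(zip(old_plids, new_plids)):
            if old != new:
                if log_redo:
                    redo_log.append(new)     # current PLID goes to the redo log
                if log_undo:
                    undo_log.append(old)     # snapshot PLID goes to the undo log
                line_mask |= 1 << i          # mark line i in the SPUR
        metadata_log.append((subpage_addr, line_mask))
    return redo_log, undo_log, metadata_log


# The state of FIG. 4A at 11:10, compared with the 11:00 snapshot.
redo, undo, meta = generate_logs(
    snapshot={0: ["P0", "P1", "P2", "P3"]},
    current={0: ["P0", "P10", "P2", "P9"]},
)
```

Intermediate values written during the transaction never appear in the logs, because only the initial and final PLIDs are compared.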
In some embodiments, a "modified" flag is maintained for each PLID entry in the indirect memory structure. The flag is set when the corresponding entry is modified, and can be cleared under software/hardware control. For example, the "modified" flag can be reset at the end of a transaction or time period of interest. An example of a modified flag is described in U.S. Patent Application No. 13/712,878, attorney docket number HICAP010, which is incorporated herein by reference in its entirety for all purposes. In these embodiments, the CCP can create a redo log of the changed lines by scanning the PLID entries of the specified memory region and copying to the log only those PLIDs flagged as modified.
In some embodiments in which a snapshot is taken at the start of a transaction, upon receiving an indication of the end of the transaction (such as a conventional prepare-to-commit indication), an indication is provided to the CCP to generate the redo and undo log information and append it to the redo and undo logs. When this logging is complete, an end-transaction indication, which includes the transaction id and a timestamp, is written to the metadata log. In some embodiments, a transaction can be aborted. Accordingly, an indication of commit or abort is provided, depending on whether the transaction is committed or aborted. In the latter case, an indication of the start of the logging for the transaction is also provided (for example, a timestamp or a log entry number).
In some embodiments, not all data lines are reference counted. For example, when a data line in an overflow area of a deduplicated memory system is encountered, the line can actually be copied to a new line location. The PLID associated with the copied line is then stored in the log.
In various embodiments, features beyond the initial log generation and the temporal copy supported by the CCP can be implemented in software with minimal impact on performance. Some of these software-supported features are described below.
In some embodiments, software flushes the top-level cache to a lower-level cache or to memory when a transaction commits, so that writes to lines made as part of the transaction are logged in a timely manner relative to the completion of the transaction. In some embodiments, the processor can perform this action as part of the commit indication.
In some embodiments, at the start of an update (such as a transaction), software running on the CPU sends a start-transaction operation to the CCP, and at the end it sends an end-transaction operation, to indicate the start and the end of the transaction, respectively. On the start-transaction indication, a transaction identifier is allocated and the current timestamp is recorded.
In some embodiments, the CCP serializes the generated log records directly to an external input/output (I/O) device (such as a network) rather than storing the records in memory. Similarly, the CCP can apply redo log records received and deserialized directly from an I/O device to a memory region, effectively advancing the memory state of the region in time to the memory state associated with the redo log records. For example, a first compute node (for example, a computing device) can efficiently checkpoint its memory state to a second compute node. Specifically, the first compute node checkpoints its memory state by taking a complete snapshot of that state and sends the checkpointed memory state to the second compute node. The first compute node also uses its CCP to generate redo log records and transmits the records over a network connection to the second node. The second node applies these redo log records to the checkpointed state received from the first compute node, thereby maintaining an up-to-date copy of the first node's memory state while incurring minimal network and application processing overhead.
In some cases, this efficient network replication technique is used to move a running application from one network host to another, minimizing interruption to the running application by copying a checkpointed memory state of the application and thereafter copying only the lines that have changed since the previously checkpointed state. In some embodiments, the logging, checkpointing, and updating are performed by the CCP into a transmit buffer before network transmission, to ensure that the CCP operations are not flow controlled to match the limits of the network, particularly when the network is congested.
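The checkpoint-then-redo replication pattern can be sketched as follows. The sketch is a software analogy only: the state is modeled as a list of line contents rather than PLIDs, redo records are hypothetical `(offset, value)` pairs, and JSON stands in for whatever wire format an actual CCP would serialize to.

```python
import json


def serialize(redo_records):
    """Encode redo records for transmission (stand-in wire format)."""
    return json.dumps(redo_records).encode("utf-8")


def deserialize(payload):
    """Decode redo records received from the I/O device."""
    return [tuple(record) for record in json.loads(payload.decode("utf-8"))]


class Replica:
    """Second node: holds a checkpoint and advances it with redo records."""

    def __init__(self, checkpoint):
        self.state = list(checkpoint)

    def apply(self, redo_records):
        for offset, value in redo_records:
            self.state[offset] = value


# The first node checkpoints its state, then ships only what changed.
primary = ["A", "E", "C", "F"]
replica = Replica(primary)              # full checkpoint sent once
primary[3], primary[1] = "D", "B"       # later updates on the first node
wire = serialize([(3, "D"), (1, "B")])  # redo records for those updates
replica.apply(deserialize(wire))
```

After the initial checkpoint, the traffic is proportional to the number of changed lines rather than to the size of the region.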
Periodically, the CCP-generated copies of the undo, redo, and/or metadata logs can be converted by software from their own line/page-oriented log record format into a conventional database form; the result is then typically copied to persistent storage such as a database or disk. An example log format has the following fields:
record identifier | transaction id | offset | old data value | new data value
where the fields correspond to the identifier of the record, the transaction in which the update was performed, the offset of the updated field within the record, the old data value of the field, and the new data value written to the field. This log representation does not use PLIDs, because the data may be stored on secondary storage that does not have the same physical level of indirection.
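As a hedged illustration of the example log format above, the following sketch builds database-style log records from the old and new contents of a record. The field layout, given as a list of (offset, length) pairs, and the names `DbLogRecord` and `diff_fields` are assumptions introduced for this example only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DbLogRecord:
    record_id: int       # identifier of the record
    transaction_id: int  # transaction in which the update was performed
    offset: int          # offset of the updated field within the record
    old_value: bytes     # old data value of the field
    new_value: bytes     # new data value written to the field


def diff_fields(record_id, transaction_id, old, new, field_layout):
    """Emit one log record per field whose bytes differ between old and new.

    field_layout is an assumed list of (offset, length) pairs.
    """
    records = []
    for offset, length in field_layout:
        old_field = old[offset:offset + length]
        new_field = new[offset:offset + length]
        if old_field != new_field:
            records.append(
                DbLogRecord(record_id, transaction_id, offset, old_field, new_field)
            )
    return records
```

Note that the resulting records carry data values rather than PLIDs, which is what makes them usable on storage without the same level of indirection.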
In some embodiments, software maintains the mapping from pages to memory blocks over specified periods of time, so that it can determine the binding of a given page to a virtual memory address at the time of a modification. For example, if physical page P needs to be logged for memory block B during the period from ti to tj, software can convert the CCP-generated log information into a form that is independent of physical memory addresses, or at least into a form suitable for long-term persistent logging by a database management system. In some embodiments, higher-level software maps physical memory to higher-level data structures and records the mapping information in the log records, so that higher-level applications can more easily use the log records to restore or reconstruct the proper data. For example, software determines that a changed PLID corresponds to an employee record in a company's employee database and, in particular, to the years of service field in that employee record. The log record is therefore generated and converted by software to include information indicating that the change occurred in the employee's years of service field. An application using the log can efficiently restore or reconstruct the employee database from the log records and a snapshot of the employee database by changing the employee's years of service field according to the value in the log record.
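The time-dependent binding of pages to memory blocks can be sketched as a simple interval lookup. The `Binding` type, the `block_for` function, and the choice of a half-open interval [start, end) are assumptions made for illustration; the text above only states that the binding holds for the period from ti to tj.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Binding:
    page: int    # physical page number
    block: str   # memory block the page belonged to
    start: int   # first time at which the binding holds (inclusive)
    end: int     # time at which the binding stops holding (exclusive, assumed)


def block_for(bindings, page, modification_time):
    """Return the memory block a page was bound to at the given time, if any."""
    for b in bindings:
        if b.page == page and b.start <= modification_time < b.end:
            return b.block
    return None


bindings = [Binding(page=7, block="B", start=10, end=20),
            Binding(page=7, block="C", start=20, end=30)]
```

With such a table, a log entry that names only physical page 7 and a modification time can be rewritten in terms of the memory block it affected.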
In some embodiments, the CCP is provided with a logical block indication (LB) corresponding to a given physical page or subpage, and the CCP automatically records this information in the log.
In an embodiment, software manages the portions of the logs held in memory and periodically flushes portions of these logs to persistent storage (such as disk or flash memory) to provide a persistent copy. The software that manages these logs is configured to determine, on receiving a request for a snapshot, whether the in-memory log buffers hold the data required to undo back to, or advance to, the desired time. If not, the additional log data required is accessed from its persistent storage location and transferred to main memory to allow the operation to be performed.
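A minimal software sketch of this log management follows. The class `LogManager`, its method names, and the use of a Python list to stand in for persistent storage are all assumptions; the sketch also assumes records are appended in ascending timestamp order.

```python
class LogManager:
    """Hypothetical manager for a log split between memory and persistent storage."""

    def __init__(self):
        self.in_memory = []   # (timestamp, record) pairs, ascending by timestamp
        self.persistent = []  # stands in for disk or flash storage

    def append(self, timestamp, record):
        self.in_memory.append((timestamp, record))

    def flush_before(self, timestamp):
        """Move entries older than the given timestamp to persistent storage."""
        self.persistent.extend(e for e in self.in_memory if e[0] < timestamp)
        self.in_memory = [e for e in self.in_memory if e[0] >= timestamp]

    def records_after(self, target_time):
        """Return the records needed to undo back to target_time.

        The second element of the result is True when the in-memory buffer
        was insufficient and extra records were fetched from persistent storage.
        """
        extra = [e for e in self.persistent if e[0] > target_time]
        recent = [e for e in self.in_memory if e[0] > target_time]
        return extra + recent, bool(extra)
```

The boolean result makes the decision described above explicit: a snapshot request is served from memory when possible and falls back to persistent storage only when older records are needed.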
The hardware implementation of logging, and of its support for snapshots, avoids the application overhead of performing these actions, including the cost of churning the processor cache to access the code and data associated with the logging implementation.
The hardware implementation also reduces the overhead of synchronizing with other application threads as part of log processing (that is, of dealing with contention for the log data structures). The CCP can fully utilize the memory system by allowing new copy operations to be issued before previously issued operations complete, thereby supporting multiple simultaneous operations and avoiding becoming a performance bottleneck beyond the performance limits of the memory system itself.
Hardware-supported temporal copying and logging of memory has been disclosed. The indirect memory representation allows an entire line to be saved to the log at a space and time cost comparable to that of saving an indirect pointer, because a reference to the line is stored in the log rather than the data itself. The indirect memory representation also allows a memory snapshot to be created in a space- and time-efficient manner, by copying the references to the lines rather than the data itself. The snapshot allows "consistent reads" to be carried out in the common case of reading at the current time more efficiently than providing the committed state by applying undo records to the modified state. It also allows snapshots from earlier times to be kept at a lower space cost, reducing the cost of repeated consistent-read transactions.
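The following sketch illustrates why a snapshot under an indirect representation is cheap: only the line references are copied, never the line data. The classes `LineStore` and `Block`, and the use of a list index as a stand-in for a PLID, are assumptions made for this example.

```python
class LineStore:
    """Hypothetical store of immutable data lines addressed by a PLID."""

    def __init__(self):
        self._lines = []

    def add(self, data):
        """Store a new line and return its PLID (here simply a list index)."""
        self._lines.append(data)
        return len(self._lines) - 1

    def read(self, plid):
        return self._lines[plid]


class Block:
    """A memory block represented indirectly as a list of PLIDs."""

    def __init__(self, store, contents):
        self.store = store
        self.plids = [store.add(c) for c in contents]

    def snapshot(self):
        """Copy the references to the lines, not the line data."""
        return list(self.plids)

    def write(self, index, data):
        """A write installs a new line; lines held by snapshots are untouched."""
        self.plids[index] = self.store.add(data)


store = LineStore()
block = Block(store, [b"x", b"y"])
snap = block.snapshot()
block.write(0, b"z")
```

After the write, `snap` still resolves to the original contents, and the unmodified second line is shared between the snapshot and the current state.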
The technique also avoids mediating application writes to memory that would otherwise be absorbed in the L1/L2 caches. In other words, it relies only on detecting a modification at the point where the line is written back from the processor cache; the write-back can be forced, for example, at the end of a round or a transaction.
The technique further provides a means of determining which lines are to be logged in the absence of modified tags, while avoiding writing lines to the log until the end of the logging interval. Deferring the writing of lines to the log until the end of the logging interval avoids multiple log entries that would result from repeated writes to the same line or to the same (sub)page, and avoids forcing write-outs from the processor cache.
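The benefit of deferring log writes can be sketched by comparing a snapshot taken at the start of the interval with the state at its end. The function name `end_of_interval_log` and the use of integers as PLIDs are illustrative assumptions.

```python
def end_of_interval_log(snapshot_plids, current_plids):
    """Emit one (index, old PLID, new PLID) entry per line that differs.

    Repeated writes to the same line within the interval collapse into a
    single entry because only the final state is compared to the snapshot.
    """
    return [
        (index, old, new)
        for index, (old, new) in enumerate(zip(snapshot_plids, current_plids))
        if old != new
    ]


snapshot = [10, 11, 12]
current = list(snapshot)
for plid in (13, 14, 15):  # three successive writes to line 0
    current[0] = plid
current[1] = 16            # one write to line 1
entries = end_of_interval_log(snapshot, current)
```

Four writes occurred during the simulated interval, but only two log entries result, and no modified tags were consulted to find them.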
The technique also allows the log to be reduced in the case of transactions: because the log records associated with a transaction are written only at the end of the transaction, on the assumption that it commits, the log need not contain log information associated with aborted transactions. In other words, the log is written only when the transaction is highly likely, if not certain, to commit. (It can be certain if the transaction is not a distributed transaction.) This is feasible because snapshotting the state makes undo possible without log support.
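A hedged sketch of commit-time logging follows. The `Transaction` class and its methods are hypothetical; the sketch keeps a snapshot for undo and a pending list that reaches the log only on commit.

```python
class Transaction:
    """Hypothetical transaction that logs only when it commits."""

    def __init__(self, state, log):
        self.state = state
        self.log = log
        self.snapshot = list(state)  # makes undo possible without the log
        self.pending = []            # records held back until commit

    def write(self, index, value):
        self.state[index] = value
        self.pending.append((index, value))

    def commit(self):
        """Only now do the transaction's records reach the log."""
        self.log.extend(self.pending)

    def abort(self):
        """Restore the state from the snapshot; nothing is written to the log."""
        self.state[:] = self.snapshot
```

An aborted transaction therefore leaves the log exactly as it was, which is the reduction described above.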
The snapshot also allows undo log information to be derived as the difference between the snapshot and the redo log.
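One possible reading of this derivation is sketched below: for every location named in the redo log, the undo record carries the value that location had in the snapshot. The function `derive_undo` and the (index, value) record shape are assumptions made for illustration.

```python
def derive_undo(snapshot, redo_log):
    """Build undo records from an earlier snapshot and a redo log.

    redo_log is an assumed list of (index, new_value) pairs. Each modified
    index yields one undo record holding its value in the snapshot.
    """
    undo = []
    seen = set()
    for index, _new_value in redo_log:
        if index not in seen:
            seen.add(index)
            undo.append((index, snapshot[index]))
    return undo


snapshot = ["a", "b", "c"]
redo_log = [(1, "B"), (2, "C"), (1, "BB")]
undo_log = derive_undo(snapshot, redo_log)
```

Applying the derived undo records to the current state returns it to the snapshot state, without any undo information having been recorded at write time.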
The hardware logging technique also means that changes are guaranteed to be logged even if they are performed by relatively untrusted application code. This is because the CCP operates independently of execution on the CPU; therefore, even if application code executes incorrectly, the CCP can still log the information without affecting the operation of the CPU.
Although the foregoing embodiments have been described in some detail for purposes of clarity of understanding, the invention is not limited to the details provided. There are many alternative ways of implementing the invention. The disclosed embodiments are illustrative and not restrictive.

Claims (20)

1. A system, comprising:
a memory including a physical memory region to be logged;
a first processor configured to execute instructions and access the memory; and
a second processor configured to:
detect a write operation performed by the first processor on the physical memory region, wherein the write operation modifies an indirect representation corresponding to a physical data line in the physical memory region; and
record log information associated with the write operation, wherein the log information includes identification information associated with the modification of the indirect representation and identification information associated with the physical data line.
2. The system according to claim 1, wherein the identification information associated with the modification of the indirect representation includes a physical line identifier (PLID).
3. The system according to claim 1, wherein:
the identification information associated with the modification of the indirect representation includes a PLID;
the PLID corresponds to the indirect representation of the physical data line that was modified; and
the recorded log information includes undo log information.
4. The system according to claim 1, wherein:
the identification information associated with the modification of the indirect representation includes a PLID;
the indirect representation is an original indirect representation;
the PLID corresponds to a modified indirect representation of the original indirect representation; and
the recorded log information includes redo log information.
5. The system according to claim 1, wherein the identification information associated with the physical data line includes a subpage update record (SPUR), the SPUR including an address of the subpage in which the data line is located and a line mask.
6. The system according to claim 1, wherein the log information is recorded when the write operation is detected.
7. The system according to claim 1, wherein the log information is recorded when a transaction associated with the write operation is committed.
8. The system according to claim 1, wherein the log further includes timestamp information.
9. The system according to claim 1, wherein the second processor is further configured to determine the log information based at least in part on a snapshot of at least a portion of the physical memory region.
10. The system according to claim 1, wherein the second processor is further configured to:
create a snapshot of the physical memory region; and
determine modifications to the physical memory region based on the snapshot and a current state of the physical memory region.
11. The system according to claim 1, wherein the second processor is configured to create a snapshot of the physical memory region by copying the PLIDs used to indirectly represent the physical memory region.
12. The system according to claim 1, wherein the second processor is configured to create a snapshot of the physical memory region at the beginning of a logging time interval.
13. The system according to claim 1, wherein the second processor is configured to create a snapshot of a subregion of the physical memory region upon detecting a first write operation to the corresponding subregion.
14. The system according to claim 1, wherein:
the log information includes undo log information; and
the second processor is configured to generate the undo log information at least in part by comparing differences between a previous state corresponding to the indirect representation and a current state corresponding to the indirect representation.
15. The system according to claim 1, wherein:
the log information includes undo log information;
the second processor is configured to generate the undo log information at least in part by comparing differences between a previous state corresponding to the indirect representation and a current state corresponding to the indirect representation; and
the previous state is based on an earlier snapshot.
16. The system according to claim 1, wherein:
the log information includes undo log information; and
the second processor is configured to generate an undo log record for a position in the indirect representation based on a redo log record for that position and the value at that position in an earlier snapshot.
17. A method of logging changes to a physical memory region during a logging time interval, comprising:
detecting a write operation to the physical memory region, wherein the write operation modifies an indirect representation corresponding to a physical data line in the physical memory region; and
recording log information associated with the write operation, wherein the log information includes identification information associated with the modification of the indirect representation and identification information associated with the physical data line.
18. The method according to claim 17, wherein the identification information associated with the modification of the indirect representation includes a physical line identifier (PLID).
19. The method according to claim 17, wherein the identification information associated with the physical data line includes a subpage update record (SPUR), the SPUR including an address of the subpage in which the data line is located and a line mask.
20. The method according to claim 17, further comprising: determining the log information based at least in part on a snapshot of at least a portion of the physical memory region.
CN201410077960.8A 2013-03-08 2014-03-05 The storage log recording of hardware supported Expired - Fee Related CN104035838B (en)

Applications Claiming Priority (6)

Application Number Priority Date Filing Date Title
US201361775041P 2013-03-08 2013-03-08
US61/775,041 2013-03-08
US61/775041 2013-03-08
US14/178,130 2014-02-11
US14/178,130 US9477558B2 (en) 2013-03-08 2014-02-11 Hardware supported memory logging
US14/178130 2014-02-11

Publications (2)

Publication Number Publication Date
CN104035838A CN104035838A (en) 2014-09-10
CN104035838B true CN104035838B (en) 2018-08-14

Family

ID=51466612

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410077960.8A Expired - Fee Related CN104035838B (en) 2013-03-08 2014-03-05 The storage log recording of hardware supported

Country Status (1)

Country Link
CN (1) CN104035838B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR102509913B1 (en) * 2017-01-25 2023-03-14 삼성전자주식회사 Method and apparatus for maximized dedupable memory
US10831666B2 (en) * 2018-10-05 2020-11-10 Oracle International Corporation Secondary storage server caching

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1842789A (en) * 2004-03-29 2006-10-04 微软公司 System and method for a snapshot query during database recovery
US7650460B2 (en) * 2007-01-26 2010-01-19 Hicamp Systems, Inc. Hierarchical immutable content-addressable memory processor

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7761732B2 (en) * 2005-12-07 2010-07-20 International Business Machines Corporation Data protection in storage systems
US8200914B2 (en) * 2008-01-03 2012-06-12 International Business Machines Corporation Apparatus, system, and method for a read-before-write storage controller instruction

Also Published As

Publication number Publication date
CN104035838A (en) 2014-09-10

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
ASS Succession or assignment of patent right

Owner name: CHERITON DAVID R.

Free format text: FORMER OWNER: HAIKANPU SYSTEM CO., LTD.

Effective date: 20150206

C41 Transfer of patent application or patent right or utility model
TA01 Transfer of patent application right

Effective date of registration: 20150206

Address after: American California

Applicant after: CHERITON DAVID R.

Address before: American California

Applicant before: Hicamp Systems, Inc

C41 Transfer of patent application or patent right or utility model
TA01 Transfer of patent application right

Effective date of registration: 20160324

Address after: American California

Applicant after: Intel Corporation

Address before: American California

Applicant before: CHERITON DAVID R.

GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20180814

Termination date: 20210305