
US20220206998A1 - Copying Container Images - Google Patents


Info

Publication number
US20220206998A1
Authority
US
United States
Prior art keywords
data chunk
data
reference count
client device
response
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
US17/699,741
Inventor
Huamin Chen
Dennis Keefe
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Red Hat Inc
Original Assignee
Red Hat Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Red Hat Inc
Priority to US17/699,741
Assigned to RED HAT, INC. (assignment of assignors interest; see document for details). Assignors: KEEFE, Dennis; CHEN, Huamin
Publication of US20220206998A1
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/17 Details of further file system functions
    • G06F16/178 Techniques for file synchronisation in file systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/13 File access structures, e.g. distributed indices
    • G06F16/137 Hash-based

Definitions

  • The present disclosure relates generally to file management and replication. More specifically, but not by way of limitation, this disclosure relates to copying container images.
  • Computers use operating systems to manage system processes and resources.
  • Some operating systems, such as the Linux operating system, include a low-level software component for managing system processes and resources.
  • The low-level software component is called a “kernel.”
  • The kernel can provide features, such as namespaces and cgroups, for isolating processes and resources from one another. These features can be used to segregate processes and resources (e.g., memory, CPU processing power, and network resources) into relatively isolated virtual-environments called “containers.”
  • Containers can be deployed from image files, which can be referred to as container images.
  • Container images are often stored in an image repository maintained by a server.
  • Devices can communicate with the server to request copies (e.g., “replicas”) of the container images.
  • Traditionally, the server will copy a container image to a destination device by first segmenting the container image into data chunks.
  • A data chunk can be a segment or block of data.
  • The server can then separately query the destination device about each data chunk to determine whether the destination device already has the data chunk, and only transmit the data chunks that are absent from the destination device. This can prevent duplicates of data chunks from being copied to the destination device.
  • FIG. 1 is a block diagram of an example of a system for copying container images according to some aspects.
  • FIG. 2 is a block diagram of another example of a system for copying container images according to some aspects.
  • FIG. 3 is a flow chart of an example of a process for copying container images according to some aspects.
  • the traditional approach can involve a server segmenting the container image into a large number (e.g., tens of thousands) of data chunks and individually querying the destination device about each data chunk to determine whether the data chunk is already present at the destination device. But each individual query takes time and bandwidth to complete. As a result, the traditional approach can introduce significant latency into the system and consume valuable bandwidth.
  • Some examples of the present disclosure can overcome one or more of the abovementioned problems by determining a likelihood that a destination device already has a data chunk (e.g., a data segment) and only querying the destination device about the data chunk if the destination device likely has the data chunk. For example, the server can determine a likelihood that the destination device already has a particular data chunk. If the destination device probably has the data chunk, the server can query the destination device to confirm that it has the data chunk. If the destination device probably does not have the data chunk, the server can transmit the data chunk to the destination device, without first querying the destination device.
  • This process can be applied to some or all of the data chunks in a container image, which may result in many of the data chunks being copied to the destination device without first querying the destination device. This can avoid a significant number of queries typically required by the traditional approach, which can reduce latency and bandwidth consumption.
  • a server can receive a request for a container image in an image repository from a destination device.
  • the server can segment the container image into data chunks.
  • the server can then determine the number of times that each of the data chunks is also present in the other container images in the image repository.
  • the number of times that a particular data chunk is present in the other container images can be referred to as a reference count.
  • the reference count can indicate the likelihood that the destination device already has the particular data chunk.
  • a higher reference count may indicate a higher likelihood (e.g., probability) that the destination device already has the particular data chunk, and a lower reference count may indicate a lower likelihood that the destination device already has the particular data chunk.
  • a particular data chunk may be part of an operating-system layer of a container image.
  • The operating-system layer can include files and libraries for implementing a particular operating system, such as the Red Hat™ Linux operating system. Because many other container images may also have the same operating-system layer for implementing the same operating system, the particular data chunk may also be present in many other container images. This can result in the particular data chunk having a relatively high reference count.
  • The server can compare each reference count to a threshold value.
  • One example of the threshold value is 500. If a reference count for a data chunk is above the threshold value, the server can query the destination device to check whether the destination device already has the data chunk. If the destination device does not already have the data chunk, the server can transmit the data chunk to the destination device. If the destination device already has the data chunk, the server can prevent the data chunk from being transmitted to the destination device. This can avoid unnecessary data-chunk duplication and bandwidth consumption. If a reference count for a data chunk is below the threshold value, the server can transmit the data chunk to the destination device, without first querying the destination device. This can avoid unnecessarily querying the destination device in circumstances when the likely result of the query would be negative, thereby reducing latency and bandwidth consumption.
  • FIG. 1 is a block diagram of an example of a system 100 for copying container images 104 a - n according to some aspects.
  • the system 100 includes a server 102 which, in some examples, can be a node in a distributed (e.g., cloud) computing environment.
  • the server 102 can manage an image repository that includes the container images 104 a - n.
  • the server 102 uses a content addressable storage (CAS) system 108 to manage the container images 104 a - n .
  • the CAS system 108 can store data chunks of the container images 104 a - n such that the data chunks can be searched and retrieved based on their content (e.g., rather than their storage locations).
  • the CAS system 108 can include a physical storage medium 112 and a data table 110 .
  • An example of the physical storage medium 112 can include a hard disk.
  • the physical storage medium 112 can physically store the data chunks for the container images 104 a - n .
  • the physical storage medium 112 can include data chunks 1-N of container image 104 a .
  • the data table 110 can maintain relationships between (i) hashed versions of the data chunks stored in the physical storage medium 112 , (ii) reference counts for the data chunks, and (iii) logical addresses indicating locations on the physical storage medium 112 at which the data chunks are stored.
  • the hashed versions of the data chunks can serve as keys that can be searched by the CAS system 108 .
  • the server 102 can receive a container image 104 a and segment the container image 104 a into N data chunks of a particular size, such as 8 bits. The server 102 can then determine hashed versions of the data chunks, and search the data table 110 for the presence of each of the hashed versions of the data chunks.
  • the server 102 can increment the reference count for the data chunk by one, without again storing the data chunk in the physical storage medium 112 . This can prevent duplicates of the same data-chunk from being stored in the physical storage medium 112 .
  • HASH(1) can represent the hashed version of data chunk 1 in container image 104 a
  • RefCount1 can represent the reference count for data chunk 1
  • Address1 can represent the logical address for data chunk 1.
  • An example of RefCount1 can be 27 (e.g., if data chunk 1 is in 27 container images in the image repository).
  • HASH(2) can represent the hashed version of data chunk 2 in container image 104 a
  • RefCount2 can represent the reference count for data chunk 2
  • Address2 can represent the logical address for data chunk 2.
  • HASH(N) can represent the hashed version of data chunk N in container image 104 a
  • RefCountN can represent the reference count for data chunk N
  • AddressN can represent the logical address for data chunk N.
  • the server 102 can update the data table 110 using the above process each time the server 102 receives a new container image.
  • The server 102 can also update the data table 110 each time a container image is removed from the repository. For example, if a container image is to be removed from the repository, the server 102 can determine which of the data chunks in the data table 110 belong to the container image and decrement the reference count for each of the data chunks by one. If decrementing a reference count for a data chunk would result in a value of zero, the server 102 can remove the data chunk's row from the data table 110 and delete the data chunk from the physical storage medium 112. Updating the data table 110 as discussed above can enable the server 102 to maintain an up-to-date log of reference counts for data chunks.
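  • The removal step described above can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the `Row` type, the `remove_image` function, and the use of in-memory dictionaries for the data table and the storage medium are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Row:
    ref_count: int  # how many times the chunk is referenced in the repository
    address: int    # logical address of the chunk on the storage medium


def remove_image(image_hashes, table, storage):
    """Decrement the reference count of each chunk of a removed image.

    `table` maps a chunk hash to its Row; `storage` maps a logical address
    to chunk bytes. Both are hypothetical stand-ins for the data table and
    the physical storage medium.
    """
    for chunk_hash in image_hashes:
        row = table[chunk_hash]
        if row.ref_count - 1 == 0:
            # No other image uses this chunk: drop the row and the stored data.
            del storage[row.address]
            del table[chunk_hash]
        else:
            row.ref_count -= 1
```

  • With a table holding one chunk at count 2 and another at count 1, removing an image that contains both leaves only the first chunk, now at count 1.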
  • the server 102 may receive a request for a container image 104 a from a destination device 106 , such as another server or a client device.
  • the server 102 can segment the container image 104 a into data chunks and search for each data chunk in the data table 110 to determine a reference count corresponding to the data chunk.
  • the reference count can indicate how many times the data chunk is present in the group of container images 104 a - n (or a subset thereof).
  • the server 102 can determine whether the reference count is above or below a threshold value 114 . In some examples, if the reference count is above the threshold value 114 , it may indicate that the destination device 106 likely already has the data chunk.
  • the server 102 can transmit a query communication 116 to the destination device 106 and await a response 118 . If the response 118 indicates that the destination device 106 already has the data chunk, the server 102 can prevent the data chunk from being transmitted to the destination device 106 . If the response 118 indicates that the destination device 106 does not already have the data chunk, the server 102 can transmit the data chunk to the destination device 106 . In some examples, if the reference count for the data chunk is below the threshold value 114 , it may indicate that the destination device 106 likely does not have the data chunk. So, the server 102 can transmit the data chunk to the destination device 106 , without first transmitting a query communication 116 to the destination device 106 .
  • the threshold value 114 can be determined based on a system constraint, a data-chunk size, or both.
  • system constraints can include a latency constraint, such as a maximum amount of latency allowed by the system 100 ; a memory constraint, such as a maximum amount of memory that the server 102 can devote to copying a container image 104 a to a destination device 106 ; a processing constraint, such as a maximum amount of processing power that the server 102 can devote to copying a container image 104 a to a destination device 106 ; or any combination of these.
  • the system constraint may be input by a user or automatically determined by the server 102 after analyzing one or more aspects of the system 100 .
  • The server 102 can tailor the threshold value 114 so as to respect a latency requirement of the system 100.
  • In one example, the system 100 can have a relatively high tolerance for latency, the data chunks can be relatively large in size, or both. So, the server 102 can set the threshold value 114 to a lower value in order to enable more query communications 116 to be sent to the destination device 106. This can reduce the number of data chunks that are sent to the destination device 106.
  • In another example, the system 100 can have a relatively low tolerance for latency, the data chunks can be relatively small in size, or both. So, the server 102 can set the threshold value 114 to a higher value to reduce the number of query communications 116 sent to the destination device 106. This can also increase the number of data chunks that are sent to the destination device 106.
  • the server 102 can balance the size of the data chunks with one or more system constraints to determine an appropriate threshold value 114 .
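  • The disclosure does not give a formula for the threshold value 114, so the following is only a hypothetical heuristic that reproduces the two trends described above: larger chunks or a larger latency budget lower the threshold, while smaller chunks or a tighter latency budget raise it. The function name `pick_threshold`, the inverse-proportional scaling, and the 100 ms baseline budget are assumptions; the baseline threshold of 500 and the 1 MB chunk size are borrowed from examples elsewhere in the text.

```python
def pick_threshold(chunk_size_bytes, latency_budget_ms,
                   base_threshold=500,
                   base_chunk_size_bytes=1_000_000,
                   base_latency_budget_ms=100.0):
    """Scale a baseline threshold by chunk size and latency tolerance.

    Hypothetical rule: the threshold shrinks as chunks get larger or as more
    latency is tolerated (so more queries are sent), and grows otherwise.
    """
    if chunk_size_bytes <= 0 or latency_budget_ms <= 0:
        raise ValueError("chunk size and latency budget must be positive")
    scale = (base_chunk_size_bytes / chunk_size_bytes) * (
        base_latency_budget_ms / latency_budget_ms
    )
    # Never drop below 1 so that some chunks are still sent without a query.
    return max(1, round(base_threshold * scale))
```

  • Under these assumptions, quadrupling the chunk size quarters the threshold, and halving the latency budget doubles it.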
  • FIG. 1 is intended to be illustrative, and other examples can include more components, fewer components, different components, or a different arrangement of the components than shown in FIG. 1 .
  • Although the server 102 includes the CAS system 108 in FIG. 1, in other examples the server 102 can be separate from and communicatively coupled to at least a portion of the CAS system 108.
  • Similarly, although the server 102 includes the container images 104 a-n in FIG. 1, in other examples the container images 104 a-n can be stored in a database that is separate from and accessible to the server 102 (e.g., via a network).
  • FIG. 2 is a block diagram of another example of a system 200 for copying container images 104 a - n according to some aspects.
  • the system 200 includes a processing device 202 communicatively coupled with a memory device 204 .
  • the processing device 202 can include one processing device or multiple processing devices. Non-limiting examples of the processing device 202 include a Field-Programmable Gate Array (FPGA), an application-specific integrated circuit (ASIC), a microprocessor, etc.
  • the processing device 202 can execute instructions 206 stored in the memory device 204 to perform operations.
  • the instructions 206 can include processor-specific instructions generated by a compiler or an interpreter from code written in any suitable computer-programming language, such as C, C++, C#, etc.
  • the memory device 204 can include one memory device or multiple memory devices.
  • the memory device 204 can be non-volatile and may include any type of memory device that retains stored information when powered off.
  • Non-limiting examples of the memory device 204 include electrically erasable and programmable read-only memory (EEPROM), flash memory, or any other type of non-volatile memory.
  • at least some of the memory device can include a medium from which the processing device 202 can read instructions 206 .
  • a computer-readable medium can include electronic, optical, magnetic, or other storage devices capable of providing the processing device 202 with computer-readable instructions or other program code.
  • Non-limiting examples of a computer-readable medium include magnetic disk(s), memory chip(s), ROM, random-access memory (RAM), an ASIC, a configured processor, optical storage, or any other medium from which a computer processor can read the instructions 206 .
  • the memory device 204 can also include any number and combination of container images 104 a - n and a data table 110 , which can indicate reference counts for some or all of the data chunks that form some or all of the container images 104 a - n.
  • the processing device 202 can receive a request for a container image 104 a from a destination device 106 .
  • the processing device 202 can segment the container image 104 a into at least two data chunks, such as data chunk 1 and data chunk N shown in FIG. 2 .
  • the processing device 202 can then determine a respective reference count for each of the data chunks.
  • the processing device 202 can determine a reference count for a data chunk using the data table 110 .
  • the processing device 202 can consult the data table 110 to determine that the reference count for data chunk 1 is 570.
  • The processing device 202 can also consult the data table 110 to determine that the reference count for data chunk N is five.
  • the processing device 202 can determine a reference count for a data chunk “on the fly.” For example, the processing device 202 can segment the other container images 104 b - n into their respective data chunks and count the number of times that data chunk 1 is present in the data chunks of the other container images 104 b - n . The processing device 202 can also count the number of times that data chunk N is present in the data chunks of the other container images 104 b - n.
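  • The "on the fly" counting described above can be sketched as follows, assuming fixed-size chunks and images held in memory as byte strings. The helper names `segment` and `reference_count` are hypothetical.

```python
def segment(image, chunk_size):
    """Split an image (bytes) into consecutive chunks of at most chunk_size bytes."""
    return [image[i:i + chunk_size] for i in range(0, len(image), chunk_size)]


def reference_count(chunk, other_images, chunk_size):
    """Count how many times `chunk` appears among the chunks of the other images."""
    return sum(segment(image, chunk_size).count(chunk) for image in other_images)
```

  • This rescans every other image for each lookup, which is why consulting a precomputed data table is the cheaper option when one is available.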
  • the processing device 202 can determine whether the reference count is above or below the threshold value 114 . For example, the processing device 202 can compare the reference count for data chunk 1 to the threshold value 114 to determine that the reference count for data chunk 1 is above the threshold value 114 (e.g., 100). So, the processing device 202 can query the destination device 106 about whether the destination device 106 already has data chunk 1. If the destination device 106 already has data chunk 1, the processing device 202 can simply move on to assessing the next data chunk, without transmitting data chunk 1 to the destination device 106 . If the destination device 106 does not already have data chunk 1, the processing device 202 can transmit data chunk 1 to the destination device 106 .
  • the processing device 202 can compare the reference count for data chunk N to the threshold value 114 to determine that the reference count for data chunk N is below the threshold value 114 (e.g., 100). So, the processing device 202 can transmit data chunk N to the destination device 106 , as represented by data chunk N′ shown in FIG. 2 . The processing device 202 can transmit data chunk N to the destination device 106 , without first querying the destination device 106 about whether the destination device 106 already has data chunk N.
  • the processing device 202 can repeat the above process for some or all of the data chunks in the container image 104 a .
  • the destination device 106 can have a copy of the container image 104 a generated using fewer queries than would be traditionally required.
  • the processing device 202 can implement some or all of the steps shown in FIG. 3 .
  • Other examples can include more steps, fewer steps, different steps, or a different combination of steps than are shown in FIG. 3 .
  • the steps of FIG. 3 are discussed below with reference to the components discussed above in relation to FIG. 2 .
  • a processing device 202 segments a container image 104 a into at least two data chunks.
  • the processing device 202 can segment the container image 104 a into data chunks of a predefined size.
  • The container image 104 a can be 6 gigabytes (GB) in size. So, the processing device 202 can split the container image 104 a into 6,000 data chunks that are 1 megabyte (MB) in size.
  • the processing device 202 determines a respective reference count for each respective data chunk among the at least two data chunks.
  • the respective reference count can indicate how many times the respective data chunk is present in a group of container images 104 a - n.
  • the processing device 202 can determine a reference count for a data chunk by accessing a data table 110 that includes relationships between data chunks and reference counts. In other examples, the processing device 202 can determine a reference count for a data chunk by first segmenting the group of container images 104 b - n into a group of data chunks. The processing device 202 can then count the number of times that the data chunk is present in the group of data chunks. The processing device 202 can use any number and combination of techniques to determine a reference count for a data chunk.
  • the processing device 202 determines that a reference count for a particular data chunk (e.g., data chunk 1) among the at least two data chunks exceeds a threshold value 114 .
  • the processing device 202 queries a destination device 106 about whether the destination device 106 already has the particular data chunk.
  • the processing device 202 can query the destination device 106 in response to determining that the reference count for the particular data chunk exceeds the threshold value 114 .
  • Querying the destination device 106 can involve transmitting a query communication to the destination device 106 .
  • the processing device 202 determines that another reference count for another data chunk (e.g., data chunk N) among the at least two data chunks is below the threshold value 114 .
  • the processing device 202 prevents the destination device 106 from being queried about the other data chunk (e.g., about whether the destination device 106 already has the other data chunk) prior to transmitting the other data chunk to the destination device 106 .
  • the processing device 202 can prevent the destination device 106 from being queried in response to determining that the other reference count for the other data chunk is below the threshold value 114 .
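  • The per-chunk decision in the process of FIG. 3 can be sketched as below. This is a minimal sketch under stated assumptions: `copy_image`, `destination_has`, and `send` are hypothetical names, with the two callables standing in for the query communication and the chunk transmission. The text only covers counts above and below the threshold, so treating a count equal to the threshold as "below" is also an assumption.

```python
def copy_image(chunks, ref_counts, threshold, destination_has, send):
    """Copy chunks to a destination, querying only for high-reference-count chunks.

    Returns the number of queries issued.
    """
    queries = 0
    for chunk in chunks:
        if ref_counts.get(chunk, 0) > threshold:
            # The destination likely has this chunk, so confirm before sending.
            queries += 1
            if destination_has(chunk):
                continue  # already present: do not transmit
        # Either the count is low (no query issued) or the query came back negative.
        send(chunk)
    return queries
```

  • Using the figures from the example above (reference counts of 570 and five against a threshold of 100), only the first chunk triggers a query, and the second is transmitted directly.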

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

Container images can be copied. For example, a computing device can segment a container image into at least two data chunks. The computing device can determine a reference count for a data chunk among the at least two data chunks. The computing device can determine that the reference count for the data chunk is below a threshold value. In response to determining that the reference count for the data chunk is below the threshold value, the computing device can prevent a destination device from being queried about the data chunk prior to transmitting the data chunk to the destination device.

Description

    REFERENCE TO RELATED APPLICATION
  • This application is a continuation of and claims priority to U.S. patent application Ser. No. 16/016,113, filed Jun. 22, 2018 and titled “Copying Container Images,” the entirety of which is hereby incorporated by reference herein.
  • TECHNICAL FIELD
  • The present disclosure relates generally to file management and replication. More specifically, but not by way of limitation, this disclosure relates to copying container images.
  • BACKGROUND
  • Computers use operating systems to manage system processes and resources. Some operating systems, such as the Linux operating system, include a low-level software component for managing system processes and resources. The low-level software component is called a “kernel.” The kernel can provide features, such as namespaces and cgroups, for isolating processes and resources from one another. These features can be used to segregate processes and resources (e.g., memory, CPU processing power, and network resources) into relatively isolated virtual-environments called “containers.” Containers can be deployed from image files, which can be referred to as container images.
  • Container images are often stored in an image repository maintained by a server. Devices can communicate with the server to request copies (e.g., “replicas”) of the container images. Traditionally, the server will copy a container image to a destination device by first segmenting the container image into data chunks. A data chunk can be a segment or block of data. The server can then separately query the destination device about each data chunk to determine whether the destination device already has the data chunk, and only transmit the data chunks that are absent from the destination device. This can prevent duplicates of data chunks from being copied to the destination device.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • FIG. 1 is a block diagram of an example of a system for copying container images according to some aspects.
  • FIG. 2 is a block diagram of another example of a system for copying container images according to some aspects.
  • FIG. 3 is a flow chart of an example of a process for copying container images according to some aspects.
  • DETAILED DESCRIPTION
  • There can be disadvantages to the traditional approach for copying a container image to a destination device. For example, the traditional approach can involve a server segmenting the container image into a large number (e.g., tens of thousands) of data chunks and individually querying the destination device about each data chunk to determine whether the data chunk is already present at the destination device. But each individual query takes time and bandwidth to complete. As a result, the traditional approach can introduce significant latency into the system and consume valuable bandwidth.
  • Some examples of the present disclosure can overcome one or more of the abovementioned problems by determining a likelihood that a destination device already has a data chunk (e.g., a data segment) and only querying the destination device about the data chunk if the destination device likely has the data chunk. For example, the server can determine a likelihood that the destination device already has a particular data chunk. If the destination device probably has the data chunk, the server can query the destination device to confirm that it has the data chunk. If the destination device probably does not have the data chunk, the server can transmit the data chunk to the destination device, without first querying the destination device. This process can be applied to some or all of the data chunks in a container image, which may result in many of the data chunks being copied to the destination device without first querying the destination device. This can avoid a significant number of queries typically required by the traditional approach, which can reduce latency and bandwidth consumption.
  • As a specific example, a server can receive a request for a container image in an image repository from a destination device. In response to the request, the server can segment the container image into data chunks. The server can then determine the number of times that each of the data chunks is also present in the other container images in the image repository. The number of times that a particular data chunk is present in the other container images can be referred to as a reference count. The reference count can indicate the likelihood that the destination device already has the particular data chunk. A higher reference count may indicate a higher likelihood (e.g., probability) that the destination device already has the particular data chunk, and a lower reference count may indicate a lower likelihood that the destination device already has the particular data chunk.
  • Some data chunks can have significantly higher reference counts than other data chunks. For example, a particular data chunk may be part of an operating-system layer of a container image. The operating-system layer can include files and libraries for implementing a particular operating system, such as the Red Hat™ Linux operating system. Because many other container images may also have the same operating-system layer for implementing the same operating system, the particular data chunk may also be present in many other container images. This can result in the particular data chunk having a relatively high reference count.
  • After determining the reference counts for the data chunks in a container image, the server can compare each reference count to a threshold value. One example of the threshold value is 500. If a reference count for a data chunk is above the threshold value, the server can query the destination device to check whether the destination device already has the data chunk. If the destination device does not already have the data chunk, the server can transmit the data chunk to the destination device. If the destination device already has the data chunk, the server can prevent the data chunk from being transmitted to the destination device. This can avoid unnecessary data-chunk duplication and bandwidth consumption. If a reference count for a data chunk is below the threshold value, the server can transmit the data chunk to the destination device, without first querying the destination device. This can avoid unnecessarily querying the destination device in circumstances when the likely result of the query would be negative, thereby reducing latency and bandwidth consumption.
  • These illustrative examples are given to introduce the reader to the general subject matter discussed here and are not intended to limit the scope of the disclosed concepts. The following sections describe various additional features and examples with reference to the drawings in which like numerals indicate like elements but, like the illustrative examples, should not be used to limit the present disclosure.
  • FIG. 1 is a block diagram of an example of a system 100 for copying container images 104 a-n according to some aspects. The system 100 includes a server 102 which, in some examples, can be a node in a distributed (e.g., cloud) computing environment. The server 102 can manage an image repository that includes the container images 104 a-n.
  • In some examples, the server 102 uses a content addressable storage (CAS) system 108 to manage the container images 104 a-n. The CAS system 108 can store data chunks of the container images 104 a-n such that the data chunks can be searched and retrieved based on their content (e.g., rather than their storage locations). To implement this functionality, the CAS system 108 can include a physical storage medium 112 and a data table 110. An example of the physical storage medium 112 can include a hard disk. The physical storage medium 112 can physically store the data chunks for the container images 104 a-n. For example, as shown in FIG. 1, the physical storage medium 112 can include data chunks 1-N of container image 104 a. The data table 110 can maintain relationships between (i) hashed versions of the data chunks stored in the physical storage medium 112, (ii) reference counts for the data chunks, and (iii) logical addresses indicating locations on the physical storage medium 112 at which the data chunks are stored. The hashed versions of the data chunks can serve as keys that can be searched by the CAS system 108.
  • In one example, the server 102 can receive a container image 104 a and segment the container image 104 a into N data chunks of a particular size, such as 8 bits. The server 102 can then determine hashed versions of the data chunks, and search the data table 110 for the presence of each of the hashed versions of the data chunks. As a particular example, if data chunk N has the value “ABC,” the server 102 can determine that the hashed version of the data chunk is HASH(ABC)=“23FG345A.” The server 102 can then search the data table 110 for “23FG345A.” If the server 102 does not find a hashed version of a data chunk in the data table 110, the server 102 can store the data chunk in the physical storage medium 112. The server 102 can also add a row to the data table 110 that includes (i) the hashed version of the data chunk, (ii) a reference count of 1, and (iii) a logical address indicating the location on the physical storage medium 112 at which the data chunk is stored. If the server 102 finds a hashed version of a data chunk in the data table 110, the server 102 can increment the reference count for the data chunk by one, without again storing the data chunk in the physical storage medium 112. This can prevent duplicates of the same data-chunk from being stored in the physical storage medium 112.
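The ingest steps in the preceding paragraph can be sketched in Python. This is a minimal illustration under stated assumptions, not the implementation described by the disclosure: the names `CasStore`, `Row`, `segment`, and `ingest` are hypothetical, SHA-256 is an assumed hash function, a dictionary stands in for the data table 110, and a list stands in for the physical storage medium 112.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class Row:
    """One row of the data table: a reference count and a logical address."""
    ref_count: int
    address: int


class CasStore:
    """Toy content-addressable store (hypothetical stand-in for CAS system 108)."""

    def __init__(self, chunk_size):
        self.chunk_size = chunk_size
        self.table = {}    # hashed chunk -> Row (stand-in for the data table)
        self.medium = []   # stand-in for the physical storage medium

    def segment(self, image):
        """Split an image into fixed-size chunks; the last one may be shorter."""
        return [image[i:i + self.chunk_size]
                for i in range(0, len(image), self.chunk_size)]

    def ingest(self, image):
        """Store each unseen chunk once; increment the count of known chunks."""
        for chunk in self.segment(image):
            key = hashlib.sha256(chunk).hexdigest()  # SHA-256 is an assumption
            row = self.table.get(key)
            if row is None:
                # New chunk: store it and add a row with a reference count of 1.
                self.medium.append(chunk)
                self.table[key] = Row(ref_count=1, address=len(self.medium) - 1)
            else:
                # Known chunk: count the reference, do not store it again.
                row.ref_count += 1


store = CasStore(chunk_size=4)
store.ingest(b"AAAABBBB")   # two new chunks
store.ingest(b"AAAACCCC")   # one shared chunk, one new chunk
```

After both calls the shared chunk `b"AAAA"` is stored once with a reference count of 2, and the store holds three chunks in total.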
  • One example of the content of the data table 110 is shown in dashed circle 120. HASH(1) can represent the hashed version of data chunk 1 in container image 104 a, RefCount1 can represent the reference count for data chunk 1, and Address1 can represent the logical address for data chunk 1. An example of RefCount1 can be 27 (e.g., if data chunk 1 is in 27 container images in the image repository). HASH(2) can represent the hashed version of data chunk 2 in container image 104 a, RefCount2 can represent the reference count for data chunk 2, and Address2 can represent the logical address for data chunk 2. HASH(N) can represent the hashed version of data chunk N in container image 104 a, RefCountN can represent the reference count for data chunk N, and AddressN can represent the logical address for data chunk N.
  • In some examples, the server 102 can update the data table 110 using the above process each time the server 102 receives a new container image. The server 102 can also update the data table 110 each time a container image is removed from the repository. For example, if a container image is to be removed from the repository, the server 102 can determine which of the data chunks in the data table 110 belong to the container image and decrement the reference count for each of the data chunks by one. If decrementing a reference count for a data chunk would result in a value of zero, the server 102 can remove the data chunk's row from the data table 110 and delete the data chunk from the physical storage medium 112. Updating the data table 110 as discussed above can enable the server 102 to maintain an up-to-date log of reference counts for data chunks.
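A minimal sketch of the removal step follows. It assumes, purely for illustration, that the data table is a dictionary mapping a chunk hash to a `[ref_count, address]` pair and that the storage medium is a dictionary keyed by address; `release_image` and both layouts are hypothetical.

```python
def release_image(chunk_hashes, table, medium):
    """Decrement the reference count of each chunk of a removed image.

    table:  dict mapping chunk hash -> [ref_count, address]  (assumed layout)
    medium: dict mapping address -> chunk bytes              (assumed layout)
    Returns the hashes whose chunks were deleted from the medium.
    """
    deleted = []
    for key in chunk_hashes:
        row = table[key]
        row[0] -= 1
        if row[0] == 0:
            # No remaining image references this chunk: drop its row and data.
            del medium[row[1]]
            del table[key]
            deleted.append(key)
    return deleted


table = {"h1": [2, 0], "h2": [1, 1]}
medium = {0: b"shared", 1: b"unique"}
removed = release_image(["h1", "h2"], table, medium)
```

Here the chunk keyed `"h2"` is deleted because its count reaches zero, while `"h1"` stays in the table with a count of 1.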
  • At some point, the server 102 may receive a request for a container image 104 a from a destination device 106, such as another server or a client device. In response, the server 102 can segment the container image 104 a into data chunks and search for each data chunk in the data table 110 to determine a reference count corresponding to the data chunk. The reference count can indicate how many times the data chunk is present in the group of container images 104 a-n (or a subset thereof). After determining a reference count for a data chunk, the server 102 can determine whether the reference count is above or below a threshold value 114. In some examples, if the reference count is above the threshold value 114, it may indicate that the destination device 106 likely already has the data chunk. So, the server 102 can transmit a query communication 116 to the destination device 106 and await a response 118. If the response 118 indicates that the destination device 106 already has the data chunk, the server 102 can prevent the data chunk from being transmitted to the destination device 106. If the response 118 indicates that the destination device 106 does not already have the data chunk, the server 102 can transmit the data chunk to the destination device 106. In some examples, if the reference count for the data chunk is below the threshold value 114, it may indicate that the destination device 106 likely does not have the data chunk. So, the server 102 can transmit the data chunk to the destination device 106, without first transmitting a query communication 116 to the destination device 106.
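The decision described above can be sketched as a single loop. The function name `copy_image` and the callables `destination_has` and `send` are hypothetical placeholders for the query communication 116, the response 118, and the transmission. The text leaves the case of a reference count equal to the threshold unspecified; this sketch assumes such chunks are sent without a query.

```python
def copy_image(chunks, ref_counts, threshold, destination_has, send):
    """Send chunks to a destination, querying first only for widely shared chunks.

    chunks:          iterable of (chunk_hash, chunk_bytes) pairs
    ref_counts:      dict mapping chunk_hash -> reference count
    threshold:       reference count above which the destination is queried
    destination_has: callable(chunk_hash) -> bool (models query and response)
    send:            callable(chunk_hash, chunk_bytes) (models the transmission)
    Returns (number_of_queries, number_of_chunks_sent).
    """
    queries = sent = 0
    for key, data in chunks:
        if ref_counts.get(key, 0) > threshold:
            queries += 1
            if destination_has(key):
                continue  # destination already has the chunk; do not transmit
        send(key, data)
        sent += 1
    return queries, sent


received = {}
stats = copy_image(
    [("os-layer", b"os bytes"), ("app-layer", b"app bytes")],
    ref_counts={"os-layer": 570, "app-layer": 5},
    threshold=100,
    destination_has=lambda key: key == "os-layer",
    send=received.__setitem__,
)
```

In this run the widely shared chunk triggers one query and is not transmitted, while the rarely shared chunk is transmitted without any query.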
  • In some examples, the threshold value 114 can be determined based on a system constraint, a data-chunk size, or both. Examples of system constraints can include a latency constraint, such as a maximum amount of latency allowed by the system 100; a memory constraint, such as a maximum amount of memory that the server 102 can devote to copying a container image 104 a to a destination device 106; a processing constraint, such as a maximum amount of processing power that the server 102 can devote to copying a container image 104 a to a destination device 106; or any combination of these. The system constraint may be input by a user or automatically determined by the server 102 after analyzing one or more aspects of the system 100.
  • As a specific example, the server 102 can tailor the threshold value 114 so as to respect a latency requirement of the system 100. In one such example, the system 100 can have a relatively high tolerance for latency, the data chunks can be relatively large in size, or both. So, the server 102 can set the threshold value 114 to a lower value in order to enable more query communications 116 to be sent to the destination device 106. This can reduce the number of data chunks that are sent to the destination device 106. In another example, the system 100 can have a relatively low tolerance for latency, the data chunks can be relatively small in size, or both. So, the server 102 can set the threshold value 114 to a higher value to reduce the number of query communications 116 sent to the destination device 106. This can also increase the number of data chunks that are sent to the destination device 106. The server 102 can balance the size of the data chunks with one or more system constraints to determine an appropriate threshold value 114.
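The trade-off can be made concrete by counting queries and transmissions for one image at several threshold values. The sketch below does not show how a threshold is chosen; it only illustrates the direction of the effect. The function `plan`, the reference counts, and the destination contents are all made up for the example.

```python
def plan(ref_counts, threshold, on_destination):
    """Return (queries, chunks_sent) for one image at a given threshold."""
    queried = [key for key, count in ref_counts.items() if count > threshold]
    # A queried chunk is skipped only when the destination reports having it.
    skipped = [key for key in queried if key in on_destination]
    return len(queried), len(ref_counts) - len(skipped)


ref_counts = {"a": 900, "b": 300, "c": 40, "d": 3}  # made-up reference counts
on_destination = {"a", "b", "c"}                     # made-up destination contents

for threshold in (500, 100, 10):
    queries, sent = plan(ref_counts, threshold, on_destination)
    print(f"threshold={threshold}: {queries} queries, {sent} chunks sent")
```

With these inputs, lowering the threshold from 500 to 10 raises the number of queries from 1 to 3 and lowers the number of transmitted chunks from 3 to 1.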
  • The example shown in FIG. 1 is intended to be illustrative, and other examples can include more components, fewer components, different components, or a different arrangement of the components than shown in FIG. 1. For instance, although the server 102 includes the CAS system 108 in FIG. 1, in other examples the server 102 can be separate from and communicatively coupled to at least a portion of the CAS system 108. Likewise, although the server 102 includes the container images 104 a-n in FIG. 1, in other examples the container images 104 a-n can be stored in a database that is separate from and accessible to the server 102 (e.g., via a network).
  • FIG. 2 is a block diagram of another example of a system 200 for copying container images 104 a-n according to some aspects. The system 200 includes a processing device 202 communicatively coupled with a memory device 204. The processing device 202 can include one processing device or multiple processing devices. Non-limiting examples of the processing device 202 include a Field-Programmable Gate Array (FPGA), an application-specific integrated circuit (ASIC), a microprocessor, etc. The processing device 202 can execute instructions 206 stored in the memory device 204 to perform operations. In some examples, the instructions 206 can include processor-specific instructions generated by a compiler or an interpreter from code written in any suitable computer-programming language, such as C, C++, C#, etc.
  • The memory device 204 can include one memory device or multiple memory devices. The memory device 204 can be non-volatile and may include any type of memory device that retains stored information when powered off. Non-limiting examples of the memory device 204 include electrically erasable and programmable read-only memory (EEPROM), flash memory, or any other type of non-volatile memory. In some examples, at least some of the memory device can include a medium from which the processing device 202 can read instructions 206. A computer-readable medium can include electronic, optical, magnetic, or other storage devices capable of providing the processing device 202 with computer-readable instructions or other program code. Non-limiting examples of a computer-readable medium include magnetic disk(s), memory chip(s), ROM, random-access memory (RAM), an ASIC, a configured processor, optical storage, or any other medium from which a computer processor can read the instructions 206.
  • The memory device 204 can also include any number and combination of container images 104 a-n and a data table 110, which can indicate reference counts for some or all of the data chunks that form some or all of the container images 104 a-n.
  • In some examples, the processing device 202 can receive a request for a container image 104 a from a destination device 106. In response, the processing device 202 can segment the container image 104 a into at least two data chunks, such as data chunk 1 and data chunk N shown in FIG. 2. The processing device 202 can then determine a respective reference count for each of the data chunks. In some examples, the processing device 202 can determine a reference count for a data chunk using the data table 110. For example, the processing device 202 can consult the data table 110 to determine that the reference count for data chunk 1 is 570. The processing device 202 can also consult the data table 110 to determine that the reference count for data chunk N is five. In other examples, the processing device 202 can determine a reference count for a data chunk “on the fly.” For example, the processing device 202 can segment the other container images 104 b-n into their respective data chunks and count the number of times that data chunk 1 is present in the data chunks of the other container images 104 b-n. The processing device 202 can also count the number of times that data chunk N is present in the data chunks of the other container images 104 b-n.
  • After determining a reference count for a data chunk, the processing device 202 can determine whether the reference count is above or below the threshold value 114. For example, the processing device 202 can compare the reference count for data chunk 1 to the threshold value 114 to determine that the reference count for data chunk 1 is above the threshold value 114 (e.g., 100). So, the processing device 202 can query the destination device 106 about whether the destination device 106 already has data chunk 1. If the destination device 106 already has data chunk 1, the processing device 202 can simply move on to assessing the next data chunk, without transmitting data chunk 1 to the destination device 106. If the destination device 106 does not already have data chunk 1, the processing device 202 can transmit data chunk 1 to the destination device 106. As another example, the processing device 202 can compare the reference count for data chunk N to the threshold value 114 to determine that the reference count for data chunk N is below the threshold value 114 (e.g., 100). So, the processing device 202 can transmit data chunk N to the destination device 106, as represented by data chunk N′ shown in FIG. 2. The processing device 202 can transmit data chunk N to the destination device 106, without first querying the destination device 106 about whether the destination device 106 already has data chunk N.
  • The processing device 202 can repeat the above process for some or all of the data chunks in the container image 104 a. At the end of this process, the destination device 106 can have a copy of the container image 104 a generated using fewer queries than would be traditionally required.
  • In some examples, the processing device 202 can implement some or all of the steps shown in FIG. 3. Other examples can include more steps, fewer steps, different steps, or a different combination of steps than are shown in FIG. 3. The steps of FIG. 3 are discussed below with reference to the components discussed above in relation to FIG. 2.
  • In block 302, a processing device 202 segments a container image 104 a into at least two data chunks. The processing device 202 can segment the container image 104 a into data chunks of a predefined size. For example, the container image 104 a can be 6 gigabytes (GB) in size. So, the processing device 202 can split the container image 104 a into 6,000 data chunks that are 1 megabyte (MB) in size.
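The chunk count in this example follows from a ceiling division of the image size by the chunk size. The short sketch below, with the hypothetical helper `chunk_count`, shows that the figure of 6,000 corresponds to decimal units (1 GB = 10^9 bytes, 1 MB = 10^6 bytes); the text does not state which unit convention is intended, so the binary case is shown only for comparison.

```python
def chunk_count(image_bytes, chunk_bytes):
    """Number of fixed-size chunks needed to cover an image (last may be partial)."""
    return -(-image_bytes // chunk_bytes)  # integer ceiling division


decimal_chunks = chunk_count(6 * 10**9, 10**6)   # 6 GB image, 1 MB chunks
binary_chunks = chunk_count(6 * 2**30, 2**20)    # 6 GiB image, 1 MiB chunks
print(decimal_chunks, binary_chunks)             # prints "6000 6144"
```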
  • In block 304, the processing device 202 determines a respective reference count for each respective data chunk among the at least two data chunks. The respective reference count can indicate how many times the respective data chunk is present in a group of container images 104 a-n.
  • In some examples, the processing device 202 can determine a reference count for a data chunk by accessing a data table 110 that includes relationships between data chunks and reference counts. In other examples, the processing device 202 can determine a reference count for a data chunk by first segmenting the group of container images 104 b-n into a group of data chunks. The processing device 202 can then count the number of times that the data chunk is present in the group of data chunks. The processing device 202 can use any number and combination of techniques to determine a reference count for a data chunk.
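The second technique, counting without a data table, can be sketched as follows. The helper names `segment` and `reference_counts` are hypothetical, and comparing raw chunk bytes (rather than hashed versions) is an assumption made to keep the example short. Whether the group passed in includes the requested image itself is left to the caller.

```python
from collections import Counter


def segment(image, chunk_size):
    """Split an image into fixed-size chunks; the last one may be shorter."""
    return [image[i:i + chunk_size] for i in range(0, len(image), chunk_size)]


def reference_counts(target, group, chunk_size):
    """For each chunk of `target`, count its occurrences in the chunks of `group`."""
    pool = Counter()
    for image in group:
        pool.update(segment(image, chunk_size))
    return {chunk: pool[chunk] for chunk in segment(target, chunk_size)}


counts = reference_counts(
    b"AAAABBBB",
    [b"AAAACCCC", b"AAAADDDD", b"BBBBEEEE"],
    chunk_size=4,
)
```

Here `b"AAAA"` appears in two of the other images and `b"BBBB"` in one, so `counts` maps them to 2 and 1 respectively.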
  • In block 306, the processing device 202 determines that a reference count for a particular data chunk (e.g., data chunk 1) among the at least two data chunks exceeds a threshold value 114.
  • In block 308, the processing device 202 queries a destination device 106 about whether the destination device 106 already has the particular data chunk. The processing device 202 can query the destination device 106 in response to determining that the reference count for the particular data chunk exceeds the threshold value 114. Querying the destination device 106 can involve transmitting a query communication to the destination device 106.
  • In block 310, the processing device 202 determines that another reference count for another data chunk (e.g., data chunk N) among the at least two data chunks is below the threshold value 114.
  • In block 312, the processing device 202 prevents the destination device 106 from being queried about the other data chunk (e.g., about whether the destination device 106 already has the other data chunk) prior to transmitting the other data chunk to the destination device 106. The processing device 202 can prevent the destination device 106 from being queried in response to determining that the other reference count for the other data chunk is below the threshold value 114.
  • The foregoing description of certain examples, including illustrated examples, has been presented only for the purpose of illustration and description and is not intended to be exhaustive or to limit the disclosure to the precise forms disclosed. Numerous modifications, adaptations, and uses thereof will be apparent to those skilled in the art without departing from the scope of the disclosure. For instance, any example(s) described herein can be combined with any other example(s).

Claims (20)

1. A system comprising:
a processing device; and
a memory device including instructions that are executable by the processing device for causing the processing device to:
receive a request for a container image from a client device; and
in response to receiving the request:
determine a reference count for a data chunk among at least two data chunks of the container image, the reference count indicating how many times the data chunk is present in a plurality of container images;
determine that the reference count for the data chunk is below a threshold value; and
in response to determining that the reference count for the data chunk is below the threshold value, prevent the client device from being queried about the data chunk prior to transmitting the data chunk to the client device.
2. The system of claim 1, wherein the memory device further includes instructions that are executable by the processing device for causing the processing device to, in response to receiving the request from the client device:
determine another reference count for another data chunk among the at least two data chunks;
determine that the other reference count for the other data chunk exceeds the threshold value;
in response to determining that the other reference count for the other data chunk exceeds the threshold value:
transmit a query communication to the client device;
receive a response to the query communication from the client device; and
transmit the other data chunk to the client device based on the response indicating that the other data chunk does not already exist on the client device.
3. The system of claim 1, wherein the memory device further includes instructions that are executable by the processing device for causing the processing device to, in response to receiving the request from the client device:
determine another reference count for another data chunk among the at least two data chunks;
determine that the other reference count for the other data chunk exceeds the threshold value;
in response to determining that the other reference count for the other data chunk exceeds the threshold value:
transmit a query communication to the client device;
receive a response to the query communication from the client device; and
prevent the other data chunk from being transmitted to the client device based on the response indicating that the other data chunk already exists on the client device.
4. The system of claim 1, wherein the memory device further includes instructions that are executable by the processing device for causing the processing device to, prior to receiving the request:
segment the plurality of container images into a plurality of data chunks of a predefined size, the plurality of data chunks including the at least two data chunks; and
for each respective data chunk among the at least two data chunks, count how many times the respective data chunk is present in the plurality of data chunks to determine a respective reference count for the respective data chunk.
5. The system of claim 1, wherein the memory device further includes instructions that are executable by the processing device for causing the processing device to determine the threshold value based on a system constraint.
6. The system of claim 5, wherein the system constraint includes a latency constraint, a memory constraint, or a processing constraint.
7. The system of claim 1, wherein the memory device further includes instructions that are executable by the processing device for causing the processing device to determine the threshold value based on a data-chunk size.
8. A method comprising:
receiving, by a processing device, a request for a container image from a client device; and
in response to receiving the request from the client device:
determining, by the processing device, a reference count for a data chunk among at least two data chunks of the container image, the reference count indicating how many times the data chunk is present in a plurality of container images;
determining, by the processing device, that the reference count for the data chunk is below a threshold value; and
in response to determining that the reference count for the data chunk is below the threshold value, preventing, by the processing device, the client device from being queried about the data chunk prior to transmitting the data chunk to the client device.
9. The method of claim 8, further comprising, in response to receiving the request from the client device:
determining another reference count for another data chunk among the at least two data chunks;
determining that the other reference count for the other data chunk exceeds the threshold value; and
in response to determining that the other reference count for the other data chunk exceeds the threshold value:
transmitting a query communication to the client device;
receiving a response to the query communication from the client device; and
transmitting the other data chunk to the client device based on the response indicating that the other data chunk does not already exist on the client device.
10. The method of claim 8, further comprising, in response to receiving the request from the client device:
determining another reference count for another data chunk among the at least two data chunks;
determining that the other reference count for the other data chunk exceeds the threshold value; and
in response to determining that the other reference count for the other data chunk exceeds the threshold value:
transmitting a query communication to the client device;
receiving a response to the query communication from the client device; and
preventing the other data chunk from being transmitted to the client device based on the response indicating that the other data chunk already exists on the client device.
11. The method of claim 8, further comprising, prior to receiving the request:
segmenting the plurality of container images into a plurality of data chunks of a predefined size, the plurality of data chunks including the at least two data chunks; and
for each respective data chunk among the at least two data chunks, counting how many times the respective data chunk is present in the plurality of data chunks to determine a respective reference count for the respective data chunk.
12. The method of claim 8, further comprising determining the threshold value based on a data-chunk size.
13. The method of claim 8, further comprising determining the threshold value based on a system constraint.
14. The method of claim 13, wherein the system constraint includes a latency constraint, a memory constraint, or a processing constraint.
15. A non-transitory computer-readable medium comprising program code that is executable by a processing device for causing the processing device to:
receive a request for a container image from a client device;
in response to receiving the request:
determine a reference count for a data chunk among at least two data chunks of the container image, the reference count indicating how many times the data chunk is present in a plurality of container images;
determine that the reference count for the data chunk is below a threshold value; and
in response to determining that the reference count for the data chunk is below the threshold value, prevent the client device from being queried about the data chunk prior to transmitting the data chunk to the client device.
16. The non-transitory computer-readable medium of claim 15, further comprising program code that is executable by the processing device for causing the processing device to, in response to receiving the request from the client device:
determine another reference count for another data chunk among the at least two data chunks;
determine that the other reference count for the other data chunk exceeds the threshold value;
in response to determining that the other reference count for the other data chunk exceeds the threshold value:
transmit a query communication to the client device;
receive a response to the query communication from the client device; and
transmit the other data chunk to the client device based on the response indicating that the other data chunk does not already exist on the client device.
17. The non-transitory computer-readable medium of claim 15, further comprising program code that is executable by the processing device for causing the processing device to, in response to receiving the request from the client device:
determine another reference count for another data chunk among the at least two data chunks;
determine that the other reference count for the other data chunk exceeds the threshold value;
in response to determining that the other reference count for the other data chunk exceeds the threshold value:
transmit a query communication to the client device;
receive a response to the query communication from the client device; and
prevent the other data chunk from being transmitted to the client device based on the response indicating that the other data chunk already exists on the client device.
18. The non-transitory computer-readable medium of claim 15, further comprising program code that is executable by the processing device for causing the processing device to, prior to receiving the request:
segment the plurality of container images into a plurality of data chunks of a predefined size, the plurality of data chunks including the at least two data chunks; and
for each respective data chunk among the at least two data chunks, count how many times the respective data chunk is present in the plurality of data chunks to determine a respective reference count for the respective data chunk.
19. The non-transitory computer-readable medium of claim 15, further comprising program code that is executable by the processing device for causing the processing device to, in response to receiving the request and prior to determining the reference count for the data chunk:
identify the at least two data chunks in the container image.
20. The non-transitory computer-readable medium of claim 15, further comprising program code that is executable by the processing device for causing the processing device to determine the threshold value based on a data-chunk size and a system constraint.
US17/699,741 2018-06-22 2022-03-21 Copying Container Images Pending US20220206998A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US17/699,741 US20220206998A1 (en) 2018-06-22 2022-03-21 Copying Container Images

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US16/016,113 US11308038B2 (en) 2018-06-22 2018-06-22 Copying container images
US17/699,741 US20220206998A1 (en) 2018-06-22 2022-03-21 Copying Container Images

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US16/016,113 Continuation US11308038B2 (en) 2018-06-22 2018-06-22 Copying container images

Publications (1)

Publication Number Publication Date
US20220206998A1 true US20220206998A1 (en) 2022-06-30

Family

ID=68981829

Family Applications (2)

Application Number Title Priority Date Filing Date
US16/016,113 Active 2039-05-12 US11308038B2 (en) 2018-06-22 2018-06-22 Copying container images
US17/699,741 Pending US20220206998A1 (en) 2018-06-22 2022-03-21 Copying Container Images

Country Status (1)

Country Link
US (2) US11308038B2 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111338827B (en) * 2020-03-20 2023-06-27 抖音视界有限公司 Method and device for pasting form data and electronic equipment

Citations (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7095854B1 (en) * 1995-02-13 2006-08-22 Intertrust Technologies Corp. Systems and methods for secure transaction management and electronic rights protection
US20110161723A1 (en) * 2009-12-28 2011-06-30 Riverbed Technology, Inc. Disaster recovery using local and cloud spanning deduplicated storage system
US8161076B1 (en) * 2009-04-02 2012-04-17 Netapp, Inc. Generation and use of a data structure for distributing responsibilities among multiple resources in a network storage system
US20120159098A1 (en) * 2010-12-17 2012-06-21 Microsoft Corporation Garbage collection and hotspots relief for a data deduplication chunk store
US8555022B1 (en) * 2010-01-06 2013-10-08 Netapp, Inc. Assimilation of foreign LUNS into a network storage system
US20130311612A1 (en) * 2012-05-16 2013-11-21 Rackspace Us, Inc. Indirection Objects in a Cloud Storage System
US8712975B2 (en) * 2011-03-08 2014-04-29 Rackspace Us, Inc. Modification of an object replica
US20160330277A1 (en) * 2015-05-06 2016-11-10 International Business Machines Corporation Container migration and provisioning
US9626332B1 (en) * 2013-12-09 2017-04-18 EMC IP Holding Company LLC Restore aware cache in edge device
US9679007B1 (en) * 2013-03-15 2017-06-13 Veritas Technologies Llc Techniques for managing references to containers
US20170177860A1 (en) * 2015-12-18 2017-06-22 Amazon Technologies, Inc. Software container registry container image deployment
US9824095B1 (en) * 2010-05-03 2017-11-21 Panzura, Inc. Using overlay metadata in a cloud controller to generate incremental snapshots for a distributed filesystem
US20170366606A1 (en) * 2014-05-13 2017-12-21 Velostrata Ltd. Real Time Cloud Workload Streaming
US20180075152A1 (en) * 2016-09-13 2018-03-15 Verizon Patent And Licensing Inc. Containerization of network services
US20180081561A1 (en) * 2016-09-16 2018-03-22 Hewlett Packard Enterprise Development Lp Acquisition of object names for portion index objects
US9928210B1 (en) * 2012-04-30 2018-03-27 Veritas Technologies Llc Constrained backup image defragmentation optimization within deduplication system
US20180350180A1 (en) * 2004-06-01 2018-12-06 Daniel William Onischuk Computerized voting system
US20180356964A1 (en) * 2017-06-07 2018-12-13 Sitting Man, Llc Methods, systems, and computer program products for intergrating configuration, monitoring, and operations
US20190098068A1 (en) * 2017-09-25 2019-03-28 Splunk Inc. Customizable load balancing in a user behavior analytics deployment
US20190236163A1 (en) * 2018-01-31 2019-08-01 EMC IP Holding Company LLC Techniques for selectively deactivating storage deduplication
US10437682B1 (en) * 2015-09-29 2019-10-08 EMC IP Holding Company LLC Efficient resource utilization for cross-site deduplication
US20190339889A1 (en) * 2018-05-03 2019-11-07 International Business Machines Corporation Creating a structurally aware block storage system
US10509769B1 (en) * 2014-06-12 2019-12-17 EMC IP Holding Company LLC Method to efficiently track I/O access history
US10740036B2 (en) * 2013-03-12 2020-08-11 Sap Se Unified architecture for hybrid database storage using fragments

Family Cites Families (26)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5893095A (en) * 1996-03-29 1999-04-06 Virage, Inc. Similarity engine for content-based retrieval of images
US6922685B2 (en) * 2000-05-22 2005-07-26 Mci, Inc. Method and system for managing partitioned data resources
US7143087B2 (en) * 2002-02-01 2006-11-28 John Fairweather System and method for creating a distributed network architecture
US7478096B2 (en) * 2003-02-26 2009-01-13 Burnside Acquisition, Llc History preservation in a computer storage system
US8326839B2 (en) * 2009-11-09 2012-12-04 Oracle International Corporation Efficient file access in a large repository using a two-level cache
US8402004B2 (en) 2010-11-16 2013-03-19 Actifio, Inc. System and method for creating deduplicated copies of data by tracking temporal relationships among copies and by ingesting difference data
WO2012147087A1 (en) * 2011-04-29 2012-11-01 Tata Consultancy Services Limited Archival storage and retrieval system
US9639591B2 (en) 2011-06-13 2017-05-02 EMC IP Holding Company LLC Low latency replication techniques with content addressable storage
US20130054906A1 (en) * 2011-08-30 2013-02-28 International Business Machines Corporation Managing dereferenced chunks in a deduplication system
US9298723B1 (en) * 2012-09-19 2016-03-29 Amazon Technologies, Inc. Deduplication architecture
US9471590B2 (en) 2013-02-12 2016-10-18 Atlantis Computing, Inc. Method and apparatus for replicating virtual machine images using deduplication metadata
WO2014185918A1 (en) * 2013-05-16 2014-11-20 Hewlett-Packard Development Company, L.P. Selecting a store for deduplicated data
US10216757B1 (en) * 2014-12-23 2019-02-26 EMC IP Holding Company LLC Managing deletion of replicas of files
US9612749B2 (en) 2015-05-19 2017-04-04 Vmware, Inc. Opportunistic asynchronous deduplication using an in-memory cache
US9769791B2 (en) * 2015-09-04 2017-09-19 Alively, Inc. System and method for sharing mobile video and audio content
US9639558B2 (en) 2015-09-17 2017-05-02 International Business Machines Corporation Image building
US10621151B2 (en) * 2015-09-25 2020-04-14 Netapp Inc. Elastic, ephemeral in-line deduplication service
US10032032B2 (en) * 2015-12-18 2018-07-24 Amazon Technologies, Inc. Software container registry inspection
US10812582B2 (en) * 2016-03-10 2020-10-20 Vmware, Inc. Management of applications across nodes using exo-clones
US10318389B2 (en) * 2016-07-15 2019-06-11 Quantum Corporation Joint de-duplication-erasure coded distributed storage
US10778633B2 (en) * 2016-09-23 2020-09-15 Apple Inc. Differential privacy for message text content mining
US10860536B2 (en) * 2017-01-05 2020-12-08 Portworx, Inc. Graph driver layer management
US10235222B2 (en) * 2017-01-05 2019-03-19 Portworx, Inc. Containerized application system graph driver
US10417184B1 (en) * 2017-06-02 2019-09-17 Keith George Long Widely accessible composite computer file operative in a plurality of forms by renaming the filename extension
US10649861B1 (en) * 2017-08-02 2020-05-12 EMC IP Holding Company LLC Operational recovery of serverless applications in a cloud-based compute services platform
US10684790B2 (en) * 2018-04-27 2020-06-16 EMC IP Holding Company LLC Serverless solution for continuous data protection

Patent Citations (25)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7095854B1 (en) * 1995-02-13 2006-08-22 Intertrust Technologies Corp. Systems and methods for secure transaction management and electronic rights protection
US20180350180A1 (en) * 2004-06-01 2018-12-06 Daniel William Onischuk Computerized voting system
US8161076B1 (en) * 2009-04-02 2012-04-17 Netapp, Inc. Generation and use of a data structure for distributing responsibilities among multiple resources in a network storage system
US20110161723A1 (en) * 2009-12-28 2011-06-30 Riverbed Technology, Inc. Disaster recovery using local and cloud spanning deduplicated storage system
US8555022B1 (en) * 2010-01-06 2013-10-08 Netapp, Inc. Assimilation of foreign LUNS into a network storage system
US9824095B1 (en) * 2010-05-03 2017-11-21 Panzura, Inc. Using overlay metadata in a cloud controller to generate incremental snapshots for a distributed filesystem
US20120159098A1 (en) * 2010-12-17 2012-06-21 Microsoft Corporation Garbage collection and hotspots relief for a data deduplication chunk store
US8712975B2 (en) * 2011-03-08 2014-04-29 Rackspace Us, Inc. Modification of an object replica
US9928210B1 (en) * 2012-04-30 2018-03-27 Veritas Technologies Llc Constrained backup image defragmentation optimization within deduplication system
US20130311612A1 (en) * 2012-05-16 2013-11-21 Rackspace Us, Inc. Indirection Objects in a Cloud Storage System
US10740036B2 (en) * 2013-03-12 2020-08-11 Sap Se Unified architecture for hybrid database storage using fragments
US9679007B1 (en) * 2013-03-15 2017-06-13 Veritas Technologies Llc Techniques for managing references to containers
US9626332B1 (en) * 2013-12-09 2017-04-18 EMC IP Holding Company LLC Restore aware cache in edge device
US20170366606A1 (en) * 2014-05-13 2017-12-21 Velostrata Ltd. Real Time Cloud Workload Streaming
US10509769B1 (en) * 2014-06-12 2019-12-17 EMC IP Holding Company LLC Method to efficiently track I/O access history
US20160330277A1 (en) * 2015-05-06 2016-11-10 International Business Machines Corporation Container migration and provisioning
US10437682B1 (en) * 2015-09-29 2019-10-08 EMC IP Holding Company LLC Efficient resource utilization for cross-site deduplication
US20170177860A1 (en) * 2015-12-18 2017-06-22 Amazon Technologies, Inc. Software container registry container image deployment
US20180075152A1 (en) * 2016-09-13 2018-03-15 Verizon Patent And Licensing Inc. Containerization of network services
US20180081561A1 (en) * 2016-09-16 2018-03-22 Hewlett Packard Enterprise Development Lp Acquisition of object names for portion index objects
US20180356964A1 (en) * 2017-06-07 2018-12-13 Sitting Man, Llc Methods, systems, and computer program products for intergrating configuration, monitoring, and operations
US20190098068A1 (en) * 2017-09-25 2019-03-28 Splunk Inc. Customizable load balancing in a user behavior analytics deployment
US20190236163A1 (en) * 2018-01-31 2019-08-01 EMC IP Holding Company LLC Techniques for selectively deactivating storage deduplication
US10579593B2 (en) * 2018-01-31 2020-03-03 EMC IP Holding Company, LLC Techniques for selectively deactivating storage deduplication
US20190339889A1 (en) * 2018-05-03 2019-11-07 International Business Machines Corporation Creating a structurally aware block storage system

Also Published As

Publication number Publication date
US11308038B2 (en) 2022-04-19
US20190392052A1 (en) 2019-12-26

Similar Documents

Publication Publication Date Title
US11474972B2 (en) Metadata query method and apparatus
JP6373328B2 (en) Aggregation of reference blocks into a reference set for deduplication in memory management
EP3814928B1 (en) System and method for early removal of tombstone records in database
US9792306B1 (en) Data transfer between dissimilar deduplication systems
US9317519B2 (en) Storage system for eliminating duplicated data
US10963454B2 (en) System and method for bulk removal of records in a database
US9020892B2 (en) Efficient metadata storage
US10776345B2 (en) Efficiently updating a secondary index associated with a log-structured merge-tree database
US10013312B2 (en) Method and system for a safe archiving of data
US11409766B2 (en) Container reclamation using probabilistic data structures
US9696919B1 (en) Source/copy reference tracking with block pointer sets
EP2821913A1 (en) A method and system for storing documents
US11100047B2 (en) Method, device and computer program product for deleting snapshots
US20220206998A1 (en) Copying Container Images
US10642789B2 (en) Extended attribute storage
US10552075B2 (en) Disk-image deduplication with hash subset in memory
US20230289349A1 (en) Database management systems using query-compliant hashing techniques
US20220391119A1 (en) Data relocation for data units in scale-out storage systems
US12169484B2 (en) Techniques for adaptive independent compression of key and non-key portions of database rows in index organized tables (IOTs)
CN116540934A (en) A data storage method and device

Legal Events

Date Code Title Description
AS Assignment

Owner name: RED HAT, INC., NORTH CAROLINA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNORS: CHEN, HUAMIN; KEEFE, DENNIS; SIGNING DATES FROM 20180615 TO 20180622; REEL/FRAME: 059329/0645

STPP Information on status: patent application and granting procedure in general

Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION

STPP Information on status: patent application and granting procedure in general

Free format text: NON FINAL ACTION MAILED

STPP Information on status: patent application and granting procedure in general

Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER

STPP Information on status: patent application and granting procedure in general

Free format text: FINAL REJECTION MAILED

STPP Information on status: patent application and granting procedure in general

Free format text: RESPONSE AFTER FINAL ACTION FORWARDED TO EXAMINER

STPP Information on status: patent application and granting procedure in general

Free format text: ADVISORY ACTION MAILED

STCV Information on status: appeal procedure

Free format text: NOTICE OF APPEAL FILED

STCV Information on status: appeal procedure

Free format text: NOTICE OF APPEAL FILED

STCV Information on status: appeal procedure

Free format text: APPEAL BRIEF (OR SUPPLEMENTAL BRIEF) ENTERED AND FORWARDED TO EXAMINER