
CN111708656A - Container image pulling method and system based on lazy loading mechanism - Google Patents


Info

Publication number
CN111708656A
Authority
CN
China
Prior art keywords
image
container
data
download
layered
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010338603.8A
Other languages
Chinese (zh)
Inventor
吴恒
张文博
颜博文
钟华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Software of CAS
Original Assignee
Institute of Software of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Software of CAS
Priority to CN202010338603.8A
Publication of CN111708656A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G06F11/1458 Management of the backup or restore process
    • G06F11/1464 Management of the backup or restore process for networked environments
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/17 Details of further file system functions
    • G06F16/176 Support for shared access to files; File sharing support
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1095 Replication or mirroring of data, e.g. scheduling or transport for data synchronisation between network nodes

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Quality & Reliability (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Facsimiles In General (AREA)

Abstract

The invention relates to a container image pulling method and system based on a lazy loading mechanism, belonging to the technical field of cloud services and computing. It aims to overcome two shortcomings of the common container runtime in existing serverless architectures: slow cold start, and sustained interference with other disk read/write applications while an image is being pulled.

Description

Method and system for pulling container images based on a lazy loading mechanism

Technical Field

The invention relates to a method and system for pulling container images based on a lazy loading mechanism, and belongs to the technical field of cloud services and computing.

Background

Cloud computing is developing rapidly, and serverless architectures are being adopted by more and more enterprises. Under the serverless model and its common implementations, a single application usually runs in a stateless compute container (for example, one managed by a container engine or containerd). These containers are event-triggered and short-lived (a life cycle may last only one invocation), which places higher demands on elastic scaling and on cold-start efficiency during iteration. The container runtime most commonly used in serverless implementations is still the native container engine, but using it to start and stop containers and to manage container image resources has the following problems: (1) Cloud service applications iterate quickly. When a new version of an application container is started, the full image data must be pulled to the local node, and when the image is large the cold-start time becomes considerable. (2) When N cloud service nodes start containers from the same image, N copies of the image data are pulled, which wastes a great deal of bandwidth. (3) Cloud servers usually run in multi-tenant mode, so downloading a container image at high speed puts sustained read/write pressure on the disk and interferes with other I/O processes running at the same time.

Summary of the Invention

To overcome the shortcomings of the prior art, the present invention provides a container image pulling method and system based on a lazy loading mechanism, which addresses the above problems with an extended container image download component and an extended storage driver component. The core idea is to move the image layer data, originally held in compressed form, to a data storage center shared by multiple nodes and to store it there uncompressed. When an image is pulled, only references to the data are created instead of transferring the full data; the actual transfer is deferred until the container starts. Because the data a container needs at startup is relatively small, cold-start efficiency improves greatly without sacrificing container execution efficiency, and disk pressure and bandwidth waste on service nodes are reduced. In addition, by extending the original container image name format, the invention increases the expressiveness of the original container system: the extended image format gives users lazy-loading image pulls without breaking the original system's compatibility within the container ecosystem. The invention is extensible and efficient, and improves the runtime execution efficiency and stability of existing serverless architectures.

The technical solution adopted by the present invention is as follows:

A method for pulling container images based on a lazy loading mechanism, comprising the following steps:

The extended container daemon of the target electronic device obtains the image name from the image download request, parses the image name, and selects whether the image is downloaded in lazy-loading form or in regular form;

If the image is downloaded in lazy-loading form, the image metadata is obtained from the remote container image center, and soft links to the image layer data in the shared data store are created locally according to the metadata; when the container starts, the required image layer data is downloaded from the shared data store through these soft links;

If the image is downloaded in regular form, or if the lazy-loading download fails, the full image data is downloaded from the remote container image center.
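The selection and fallback logic of the steps above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the source does not specify the extended image name format, so the `lazy/` prefix, the `PullResult` type, and the two callables standing in for the lazy and regular download paths are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

# Hypothetical marker: the source does not define the extended name format.
LAZY_PREFIX = "lazy/"


@dataclass
class PullResult:
    image: str
    mode: str  # "lazy" or "regular"


def parse_image_name(name: str) -> Tuple[str, bool]:
    """Return (plain image name, whether lazy loading was requested)."""
    if name.startswith(LAZY_PREFIX):
        return name[len(LAZY_PREFIX):], True
    return name, False


def pull(name: str,
         lazy_pull: Callable[[str], None],
         regular_pull: Callable[[str], None]) -> PullResult:
    """Try the lazy path when requested; fall back to a full download."""
    image, lazy = parse_image_name(name)
    if lazy:
        try:
            lazy_pull(image)  # fetch metadata, create soft links only
            return PullResult(image, "lazy")
        except Exception:
            # Any failure on the lazy path falls through to the regular pull.
            pass
    regular_pull(image)  # full download from the remote image center
    return PullResult(image, "regular")
```

Passing the two download paths in as callables keeps the dispatch testable without a registry or shared store; a real daemon would presumably wire in its own components here.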

A system for pulling container images based on a lazy loading mechanism, comprising a shared data store, a remote container image center, and an electronic device serving as a local node;

The shared data store is responsible for storing uncompressed image layer data and for providing a shared data storage service to local nodes;

The remote container image center is responsible for storing image metadata and compressed image data;

The electronic device runs an extended container daemon. The daemon obtains the image name from the image download request, parses the image name, and selects whether the image is downloaded in lazy-loading form or in regular form. If the image is downloaded in lazy-loading form, the image metadata is obtained from the remote container image center, and soft links to the image layer data in the shared data store are created locally according to the metadata; when the container starts, the required image layer data is downloaded from the shared data store through these soft links. If the image is downloaded in regular form, or if the lazy-loading download fails, the full image data is downloaded from the remote container image center.

Further, the extended container daemon includes an extended download component. In response to an image download request, this component obtains the image metadata from the remote container image center and creates soft links to the image layer data in the shared data store according to the image layer data IDs in the metadata; when the container starts, the required image layer data is downloaded from the shared data store through these soft links.

Further, creating local soft links to the image layer data in the shared data store according to the metadata means the following: the metadata contains the unique DiffID of every layer of the image; the ChainID of each layer is computed from the DiffIDs; the ChainID is used to look up the corresponding image layer data in the shared data store; and a soft link to that layer data is created locally.

Further, when the soft link to the image layer data is created locally, the parameters of the obtained image metadata, including the layer size and CacheID, are also set, so that the container's original image storage service continues to operate normally.

Further, the extended container daemon calls the container's default image download component to download the full image data from the remote container image center.

Further, the extended container daemon includes an extended storage driver component. The storage driver component is called before the container starts; it obtains the layer data required by the image from the local image layer store according to the image ID used by the container, and uses the layer data to build the container's root file system as an overlay. During container execution it also monitors how processes in the container read and write each layer of the image, derives the container's hot data by statistical analysis, and calls the extended download component to download that hot data to local storage for use when the container runs after a subsequent start.

Compared with the prior art, the present invention has the following advantages:

Existing serverless platforms that use the native container as the runtime suffer from long container cold-start times, prolonged interference with other applications, and low bandwidth utilization. To address this, the invention stores container image data in a storage center shared by multiple nodes and changes the image download implementation so that only references to the corresponding layer data are created, deferring the actual data transfer until the container process executes. This improves the operational stability of the serverless platform and further improves container cold-start efficiency.

Brief Description of the Drawings

To explain more clearly how the embodiments of the present invention work, the accompanying drawings are briefly described below:

FIG. 1 is a schematic structural diagram of the container image pulling system based on the lazy loading mechanism according to an embodiment;

FIG. 2 is a schematic flowchart of the extended download component of the container image pulling system based on the lazy loading mechanism according to the embodiment;

FIG. 3 is a schematic flowchart of the extended storage driver of the container image pulling system based on the lazy loading mechanism according to the embodiment.

Detailed Description

The technical solutions of the present invention are further described below with reference to the accompanying drawings. The described embodiments are some, but not all, of the embodiments of the present invention. Some techniques well known to those skilled in the art may not be described in detail.

A container image pulling system based on a lazy loading mechanism comprises a shared data store, a remote container image center, and an electronic device. The electronic device can be summarized as running a client service and an extended container daemon;

The client service is the foreground interactive service. Its principal may be a person or an application. When a client needs some service resource, the client service sends the container daemon a request containing the business logic;

The container daemon is the core part. The extended container daemon contains an optimized image data download component and a storage driver component. The daemon and the client service interact through remote procedure calls. The extended container daemon accepts requests from the client service to pull image data from the shared data store; each request contains the name of the image to download. Through modifications to the original container daemon, the system supports an extended container image format. If an image is submitted in the extended form, it is pulled in lazy-loading form by preference: the container daemon first tries to connect to the remote container image center and obtain the metadata for the image, and the optimized download component then uses the metadata to download the image data from the shared data store. If the image is submitted in the default form, or if the lazy-loading download fails, the download component falls back to the regular download process, that is, downloading from the container image center, which keeps the service robust. Image layer data downloaded in the improved way contains only references to the original layer data; the actual data transfer is deferred until the application in the container starts.

When the container daemon receives a command to start a container, it calls the extended storage driver component to build the container's root file system, and the storage driver component uses the downloaded image data to create the container's root directory. Data that the container needs while running is loaded to the local node on demand, which speeds up container startup. The storage driver component monitors the execution of applications in the container, collects and analyzes the container's hot data, and calls the download component to load that hot data to the local node in advance.

The shared data store is responsible for storing uncompressed image layer data and for providing a shared data storage service to multiple nodes. The extended download component queries and pulls image layer data from the shared data store.

The remote container image center is responsible for storing image metadata, and also stores some compressed image data that does not exist in the shared data store.

The extended container daemon includes an extended download component, described as follows:

The extended download component accepts image layer data download requests from the container daemon and communicates with the remote container image center to obtain the image metadata. The metadata includes the unique DiffID (image layer checksum ID) of every layer of the image. From the DiffIDs, the ChainID of each layer (the index ID used by the Docker content addressing mechanism) can be computed. Using this value, the extended download component looks up the corresponding layer data in the shared data store and creates a local soft link to it. It also sets the related metadata parameters of the local layer data, including the layer size and the CacheID (a universally unique identifier generated by the host), so that the container's original image storage service continues to operate normally. When downloading image layer data, the extended download component does not download the full data; it creates references to the layer data on the local node, and no actual data is transferred until the container starts. On receiving a request to download an image from the shared data store, the extended download component looks up the matching layer data in the data store and stores soft links to it locally.
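As a rough illustration of the "reference instead of copy" step described above, the sketch below creates a soft link to a layer directory in the shared store and records the layer size and a host-generated CacheID next to it. The directory layout (`layers/`, `layerdb/`), the `meta.json` file, and the function name `link_layer` are assumptions made for this example; the source does not describe the on-disk format.

```python
import json
import os
import uuid
from pathlib import Path


def link_layer(shared_layer_dir: Path, local_store: Path,
               chain_id: str, size: int) -> Path:
    """Create a local soft link to a shared layer and record its metadata.

    No layer content is copied; only the link and a small metadata file
    are written on the local node.
    """
    cache_id = str(uuid.uuid4())  # host-generated UUID, as the text describes

    # Hypothetical layout: links under layers/, metadata under layerdb/.
    link = local_store / "layers" / cache_id
    link.parent.mkdir(parents=True, exist_ok=True)
    os.symlink(shared_layer_dir, link, target_is_directory=True)

    meta_dir = local_store / "layerdb" / chain_id
    meta_dir.mkdir(parents=True, exist_ok=True)
    (meta_dir / "meta.json").write_text(
        json.dumps({"size": size, "cache_id": cache_id})
    )
    return link
```

Reads through the returned link resolve to the shared store, so data moves only when a process in the container actually touches a file.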

When the extended download component cannot download image data from the shared data store, it falls back to the container's default image download process, that is, it calls the container's default image download component to download the image data from the remote container image center. The result of the fallback is returned to the container daemon and to the client service.

The extended container daemon includes an extended storage driver component, described as follows:

The extended storage driver is called before the container starts. It obtains the layer data required by the image from the local image layer store according to the image ID used by the container. The image ID differs from the layer-related IDs: it is an identifier unique to one container image, while an image contains multiple layers. The extended storage driver then uses the layer data to build the container's root file system as an overlay. Because the image data is stored locally as soft links, the data needed while the container runs is read from the shared data store to the local node on demand.
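To make the overlay step concrete, the sketch below assembles the option string that a Linux overlayfs mount takes: read-only image layers go in `lowerdir` (colon-separated, topmost layer first), and the container's writable layer goes in `upperdir` with a scratch `workdir`. The function name and the example paths are illustrative assumptions; the source does not show how its storage driver builds the mount.

```python
from pathlib import PurePosixPath
from typing import Sequence


def overlay_mount_options(layers_top_first: Sequence[PurePosixPath],
                          upper: PurePosixPath,
                          work: PurePosixPath) -> str:
    """Build the data string for an overlayfs mount.

    layers_top_first lists the read-only layer directories with the
    topmost image layer first, matching overlayfs lowerdir ordering.
    """
    if not layers_top_first:
        raise ValueError("an overlay needs at least one lower layer")
    lower = ":".join(str(p) for p in layers_top_first)
    return f"lowerdir={lower},upperdir={upper},workdir={work}"


if __name__ == "__main__":
    opts = overlay_mount_options(
        [PurePosixPath("/var/lib/lazy/layers/app"),
         PurePosixPath("/var/lib/lazy/layers/base")],
        PurePosixPath("/var/lib/lazy/containers/c1/upper"),
        PurePosixPath("/var/lib/lazy/containers/c1/work"),
    )
    print(opts)
```

The resulting string would be passed as the data argument of an overlay mount, for example `mount -t overlay overlay -o <options> <merged dir>`, which requires root privileges and is therefore left out of the sketch.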

While the container runs, the extended storage driver monitors how processes in the container read and write each layer, derives the container's hot data by statistical analysis, and calls the extended download component to pre-download the hot data or hot layers to local storage. When the container runs after a subsequent start, it uses this preloaded data, which improves program efficiency.

In this embodiment, the container image pulling system based on the lazy loading mechanism is shown in FIG. 1 and operates in the following steps:

Step 101: The client service, which may be a user or an upper-layer application, sends the container daemon a request to download an image that is to be fetched from the shared data store; or the client service sends the container daemon a request to start a container whose image was downloaded from the shared data store.

Step 102: The container daemon wraps the interfaces provided by the extended download component and the extended storage driver in a uniform way, and exposes to the client service a service interface similar to that of the other default components;

Step 103: The extended download component communicates with the remote container image center and uses parameters such as the image name and version to obtain the image metadata, which contains the image's layer information;

Step 104: The extended download component uses the layer information to query the shared data store for the layer data; if the corresponding layer data is found, soft links to it are created and stored locally;

Step 105: The extended storage driver monitors the container's data reads and writes while the container runs, and calls the extended download component to download hot data.

In this embodiment, the flow of the extended download component of the container image pulling system based on the lazy loading mechanism is shown in FIG. 2. The specific steps are as follows:

Step 201: The client submits a request to download image data, and the container daemon parses the image name to determine whether to download from the shared storage by preference;

Step 202: If the download should preferentially come from the shared storage, the container daemon calls the extended download component. The component first tries to obtain the image metadata from the remote container image center, and obtains the DiffID of each image layer, which is the checksum of that layer's packed file;

Step 203: The obtained DiffIDs are used to compute the ChainID of each layer, that is, a value derived from the DiffIDs of the current layer and all of its ancestor layers, computed as:

ChainID(layerN)=SHA256hex(ChainID(layerN-1)+""+DiffID(layerN));

Step 204: The computed ChainID is used to look up layer data with the same ChainID in the shared data store;

Step 205: If the data exists, a local soft link is created and meta-information such as the layer data size and CacheID is set. The CacheID is a universally unique identifier (UUID) generated by the host; it corresponds one-to-one with an image layer file and is used on the host to index image layer files;

Step 206: If the image name specifies the default download method directly, or if any error occurs during the extended download component's download, the process falls back to the default download flow, in which the default download component downloads the full image data from the remote container image center;

Step 207: The download component's result or error information is returned to the container daemon and then to the client service.
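The ChainID recursion of step 203 can be sketched as below. Two details are assumptions taken from the OCI image specification and not from this text: the bottom layer's ChainID equals its DiffID, and the two operands are joined by a single space (the formula as printed here shows an empty string between them, so the separator is left as a parameter). The function name `chain_ids` is hypothetical.

```python
import hashlib
from typing import List


def chain_ids(diff_ids: List[str], sep: str = " ") -> List[str]:
    """Compute the ChainID of every layer from the ordered DiffIDs.

    diff_ids is ordered from the bottom (base) layer to the top layer.
    Base case (assumed from the OCI image spec): ChainID(layer0) = DiffID(layer0).
    Recursive case: SHA-256 over ChainID(layerN-1) + sep + DiffID(layerN).
    """
    result: List[str] = []
    for diff_id in diff_ids:
        if not result:
            result.append(diff_id)
        else:
            payload = (result[-1] + sep + diff_id).encode("utf-8")
            result.append("sha256:" + hashlib.sha256(payload).hexdigest())
    return result
```

Because each ChainID folds in every ancestor, two images share a ChainID only when they share the whole stack of layers up to that point, which is what makes it usable as a lookup key in the shared data store (step 204).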

In this embodiment, the flow of the extended storage driver of the container image pulling system based on the lazy loading mechanism is shown in FIG. 3:

Step 301: The extended storage driver component receives, via the container daemon, the client service's request to start a container, and builds the root file system for the container according to the request;

Step 302: After the container starts successfully, the container monitoring unit is started;

Step 303: The container monitoring unit collects information about data reads during container execution, including file names, file sizes, and so on;

Step 304: The container monitoring unit passes the container's runtime information to the data analysis unit;

Step 305: The data analysis unit returns the analyzed data, which includes the hot data read by the container, to the container monitoring unit;

Step 306: The container monitoring unit in turn returns the hot data information to the extended storage driver component;

Step 307: The component returns the result of the call that builds the container's root file system;

Step 308: The extended storage driver component, based on the returned hot data, calls the extended download component to download the corresponding hot data.
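The text does not say how the data analysis unit decides what counts as hot data. One simple possibility, shown here purely as an assumption, is to count reads per file from the monitoring unit's (file name, file size) records and treat files read at least a threshold number of times as hot. The function name `hot_files` and the `min_reads` threshold are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def hot_files(read_events: Iterable[Tuple[str, int]],
              min_reads: int = 2) -> List[str]:
    """Return paths read at least min_reads times, most-read first.

    Each event is (file path, file size), mirroring the fields that the
    monitoring unit is described as collecting. Size is ignored by this
    simple frequency heuristic.
    """
    counts = Counter(path for path, _size in read_events)
    return [path for path, n in counts.most_common() if n >= min_reads]
```

The returned list would then be handed to the download component for prefetching (step 308). A production heuristic might also weigh file size or time of first access; this sketch uses frequency alone.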

Using several commonly used container images (Nginx, Redis, Postgres, and Couchbase), the traditional deployment system and the system of the present invention were compared, recording the container startup time for each. The test results are shown in Table 1 below.

Table 1 Image deployment time

                                Nginx     Redis     Postgres   Couchbase
Traditional deployment system   4.622 s   4.178 s   11.121 s   32.754 s
This system                     0.516 s   0.757 s   0.829 s    0.856 s

As Table 1 shows, compared with the traditional deployment method, this system improves the average deployment time by more than 80%.
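The "more than 80%" figure can be checked directly from the Table 1 timings. The short script below computes the relative reduction in deployment time per image; the numbers are the table's, while the variable names are only for illustration.

```python
# Timings in seconds, taken from Table 1.
traditional = {"Nginx": 4.622, "Redis": 4.178, "Postgres": 11.121, "Couchbase": 32.754}
lazy_system = {"Nginx": 0.516, "Redis": 0.757, "Postgres": 0.829, "Couchbase": 0.856}

# Relative reduction: 1 - (new time / old time).
reduction = {
    name: 1 - lazy_system[name] / traditional[name]
    for name in traditional
}

for name, value in reduction.items():
    print(f"{name}: {value:.1%} less deployment time")
```

Every image clears the 80% mark, with the smallest gain on Redis and the largest on Couchbase, the slowest image to deploy traditionally.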

The above description of specific implementations of the embodiments of the present invention is illustrative only; the scope of protection of the present invention is set out in the claims. Any changes and modifications made by those skilled in the art on the basis of the above description fall within the scope of protection of the present invention.

Claims (10)

1. A container image pulling method based on a lazy loading mechanism, characterized by comprising the following steps:
an extended container daemon of a target electronic device obtains an image name from an image download request, parses the image name, and selects whether the image is to be downloaded in lazy loading form or in regular form;
if the image is downloaded in lazy loading form, the metadata of the image is obtained from a remote container image center, and soft links to the image layer data in a shared data storage end are established locally according to the metadata; when a container starts, the required image layer data is downloaded from the shared data storage end through the soft links;
if the image is downloaded in regular form, or downloading the image in lazy loading form fails, the full image data is downloaded from the remote container image center.

2. The method according to claim 1, characterized in that the extended container daemon comprises an extended download component, which is responsible for obtaining the metadata of the image from the remote container image center according to the image download request, and for establishing soft links to the image layer data in the shared data storage end according to the image layer data IDs in the metadata; when the container starts, the required image layer data is downloaded from the shared data storage end through the soft links.

3. The method according to claim 1 or 2, characterized in that establishing locally, according to the metadata, soft links to the image layer data in the shared data storage end means that: the metadata includes the unique DiffID of every layer of the image; the ChainID of each layer is calculated from the DiffIDs of the layers; the ChainID is used to search the shared data storage end for the corresponding image layer data; and a soft link to that image layer data is established locally.

4. The method according to claim 1, characterized in that the extended container daemon comprises an extended storage driver component, which is invoked before the container starts and is responsible for obtaining the layer data required by the image from the local image layer store according to the image ID used by the container, and for building the root file system of the container from the layer data in Overlay fashion; the component is also responsible for monitoring, while the container is running, the read and write patterns of processes in the container on each layer of the image, obtaining the hot data of the container through statistical analysis, and invoking the extended download component to download the hot data to local storage, so that it is available at run time after the container is started a second time.

5. A container image pulling system based on a lazy loading mechanism, characterized by comprising a shared data storage end, a remote container image center, and an electronic device serving as a local node, wherein:
the shared data storage end is responsible for storing uncompressed image layer data and for providing a shared data storage service to local nodes;
the remote container image center is responsible for storing image metadata and compressed image data;
the electronic device runs an extended container daemon, which obtains an image name from an image download request, parses the image name, and selects whether the image is to be downloaded in lazy loading form or in regular form; if the image is downloaded in lazy loading form, the metadata of the image is obtained from the remote container image center, and soft links to the image layer data in the shared data storage end are established locally according to the metadata; when a container starts, the required image layer data is downloaded from the shared data storage end through the soft links; if the image is downloaded in regular form, or downloading the image in lazy loading form fails, the full image data is downloaded from the remote container image center.

6. The system according to claim 5, characterized in that the extended container daemon comprises an extended download component, which is responsible for obtaining the metadata of the image from the remote container image center according to the image download request, and for establishing soft links to the image layer data in the shared data storage end according to the image layer data IDs in the metadata; when the container starts, the required image layer data is downloaded from the shared data storage end through the soft links.

7. The system according to claim 5 or 6, characterized in that establishing locally, according to the metadata, soft links to the image layer data in the shared data storage end means that: the metadata includes the unique DiffID of every layer of the image; the ChainID of each layer is calculated from the DiffIDs of the layers; the ChainID is used to search the shared data storage end for the corresponding image layer data; and a soft link to that image layer data is established locally.

8. The system according to claim 7, characterized in that, while the soft link to the image layer data is established locally, parameters of the obtained image metadata are set, the parameters including layer size and CacheID information, so as to ensure the normal operation of the container's original image storage service.

9. The system according to claim 5, characterized in that the extended container daemon invokes the container's default image download component to download the full image data from the remote container image center.

10. The system according to claim 5, characterized in that the extended container daemon comprises an extended storage driver component, which is invoked before the container starts and is responsible for obtaining the layer data required by the image from the local image layer store according to the image ID used by the container, and for building the root file system of the container from the layer data in Overlay fashion; the component is also responsible for monitoring, while the container is running, the read and write patterns of processes in the container on each layer of the image, obtaining the hot data of the container through statistical analysis, and invoking the extended download component to download the hot data to local storage, so that it is available at run time after the container is started a second time.
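The ChainID lookup and soft-link step described in claims 3 and 7, together with the fallback condition of claims 1 and 5, can be sketched as follows. This is an illustrative sketch only, not the patented implementation. The claims do not state how a ChainID is derived from DiffIDs, so the sketch assumes the rule from the OCI image specification (the ChainID of the bottom layer equals its DiffID, and each later ChainID is the SHA-256 digest of the previous ChainID, a space, and the next DiffID). The function names `chain_ids` and `link_layers`, and the layout of one directory per ChainID in the shared store, are hypothetical.

```python
import hashlib
import os


def chain_ids(diff_ids):
    """Return the ChainID of every layer, bottom layer first.

    Assumes the OCI image-spec rule: ChainID(L0) = DiffID(L0), and
    ChainID(L0..Ln) = sha256(ChainID(L0..Ln-1) + " " + DiffID(Ln)).
    """
    result = []
    for diff_id in diff_ids:
        if not result:
            result.append(diff_id)
        else:
            joined = f"{result[-1]} {diff_id}".encode("utf-8")
            result.append("sha256:" + hashlib.sha256(joined).hexdigest())
    return result


def link_layers(diff_ids, shared_root, local_root):
    """Symlink each layer found in the shared store into the local store.

    Returns True when every layer could be linked. Returns False, after
    removing the links created by this call, when a layer is missing, so
    that the caller can fall back to a regular full pull.
    The one-directory-per-ChainID layout is an assumption of this sketch.
    """
    created = []
    for chain_id in chain_ids(diff_ids):
        name = chain_id.replace(":", "_")  # avoid ':' in file names
        target = os.path.join(shared_root, name)
        link = os.path.join(local_root, name)
        if not os.path.isdir(target):
            # Layer absent from the shared store: undo and report failure.
            for path in created:
                os.unlink(path)
            return False
        if not os.path.islink(link):
            os.symlink(target, link)
            created.append(link)
    return True
```

A caller would run `link_layers` on the DiffID list taken from the image metadata and trigger the regular download path whenever it returns `False`. Rolling back partial links keeps the local store from holding an image with only some of its layers.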
CN202010338603.8A 2020-04-26 2020-04-26 Container image pulling method and system based on lazy loading mechanism Pending CN111708656A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010338603.8A CN111708656A (en) 2020-04-26 2020-04-26 Container image pulling method and system based on lazy loading mechanism

Publications (1)

Publication Number Publication Date
CN111708656A (en) 2020-09-25

Family

ID=72536349

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010338603.8A Pending CN111708656A (en) 2020-04-26 2020-04-26 Container image pulling method and system based on lazy loading mechanism

Country Status (1)

Country Link
CN (1) CN111708656A (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106506587A (en) * 2016-09-23 2017-03-15 中国人民解放军国防科学技术大学 A Docker image download method based on distributed storage
CN110837408A (en) * 2019-09-16 2020-02-25 中国科学院软件研究所 A high-performance serverless computing method and system based on resource caching

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022206722A1 (en) * 2021-04-01 2022-10-06 华为云计算技术有限公司 Container application starting method, image management method, and related devices
CN114327754A (en) * 2021-12-15 2022-04-12 中电信数智科技有限公司 Mirror image exporting and assembling method based on container layering technology
CN114327754B (en) * 2021-12-15 2022-10-04 中电信数智科技有限公司 Mirror image exporting and assembling method based on container layering technology
CN114691299A (en) * 2022-03-22 2022-07-01 浪潮云信息技术股份公司 Serverless-based edge computing resource management system
US12105733B2 (en) 2022-10-25 2024-10-01 Red Hat, Inc. Chunk aware image locality scores for container images in multi-node clusters

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20200925)