CN117193656A - A data hierarchical storage migration process method - Google Patents

A data hierarchical storage migration process method

Info

Publication number: CN117193656A
Authority: CN (China)
Prior art keywords: data, migration, storage, level, storage device
Legal status: Pending (an assumption, not a legal conclusion)
Application number: CN202311232833.6A
Other languages: Chinese (zh)
Inventors: 吴洪桥, 马立广, 张敬波, 何维, 曹彦荣
Current Assignee: Ministry Of Natural Resources Information Center (the listed assignee may be inaccurate)
Original Assignee: Ministry Of Natural Resources Information Center
Application filed by: Ministry Of Natural Resources Information Center

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention provides a data hierarchical storage migration process method and relates to the field of data storage technology. By carrying out data storage migration across multi-level storage devices, the invention dynamically manages data among the storage devices at each level. It establishes a standardized process management method for the hierarchical storage of massive data, removes unreasonable factors such as subjectivity and arbitrariness from the management of massive data on tiered storage devices, achieves efficient organization and dynamic storage management of massive data across online, near-line and offline storage resources throughout its life cycle, and improves data security and service efficiency.

Description

A data hierarchical storage migration process method

Technical field

The present invention relates to the field of information technology, and in particular to a data hierarchical storage migration process method.

Background

At present, as information technology is applied ever more deeply across all industries, the volume of accumulated data keeps growing. Storage devices of many different performance and application levels have been deployed in large numbers in the data centers of industries or departments where data is relatively concentrated, and massive data storage management has become a bottleneck for the informatization work of some industries and departments.

One prominent symptom is that the tiered storage management of various kinds of data is poorly defined: dynamic storage management, such as moving data into and out of storage devices at different levels, lacks a standard, scientific process management method, and data storage management processes based on the full life cycle are unscientific and incomplete. These problems introduce subjective and arbitrary factors into data storage management, which in turn cause data management security problems and data service efficiency problems, and so hold back improvements in data service capability and application value overall. Therefore, while providing data management and services, informatization efforts in every industry urgently need a scientific and effective storage management process method that makes full use of online, near-line and offline tiered storage resources, manages dynamic storage across the life cycle of each kind of data, safeguards data security, and improves the efficiency of data management and services.

Summary of the invention

The purpose of the present invention is to provide a data hierarchical storage migration process method that solves the aforementioned problems in the prior art.

To achieve this purpose, the present invention adopts the following technical solution:

A data hierarchical storage migration process method performs data storage migration across multi-level storage devices so that data in the multi-level storage devices is dynamically managed among the devices at each level. The storage levels of the multi-level storage devices are storage categories defined by differences in device access mode, access performance and device security; from high to low they are the first-level storage device, the second-level storage device, ..., and the M-th-level storage device. The storage migration process comprises a data promotion process and a data demotion process. In the data promotion process, data in a lower-level storage device is migrated to a higher-level storage device according to promotion rules; in the data demotion process, data in a higher-level storage device is migrated to a lower-level storage device according to demotion rules;

The data promotion process includes a promotion process based on user demand, which comprises the following steps:

S1: when a user accesses data X, determine where data X is located. If it is in the highest-level storage device, no migration is performed and the process goes directly to step S5; if it is in the demotion queue, go to step S2; if it is in a lower-level storage device, go to step S3; if it is in the promotion queue, go to step S4;

S2: check whether the demotion of data X has completed. If it has, go to step S3; if it has not, stop the migration, remove data X from the demotion queue, and go to step S5, keeping data X in the higher-level storage device;

S3: re-evaluate data X with the value evaluation method, adjust the data value ranking in the lower-level storage device, and place data X at the front of the promotion queue for promotion; once the promotion finishes and data X is in the higher-level storage device, go to step S5;

S4: check whether data X has completed the promotion process. If it has not started migrating, place data X at the front of the promotion queue and continue the migration; if it is being migrated, expedite the promotion and go to step S5 once the migration completes; if it has already been migrated, go directly to step S5;

S5: data X is kept in the higher-level storage device;

The data value evaluation method specifically comprises evaluating the value by formula (1):

where IP(X) denotes the inherent attributes of data file X, S the file size, T the file access time, F the data read/write frequency, C the number of users accessing the file, R the data correlation degree, and Q the storage cost and migration cost.

Preferably, the data demotion process includes active demotion and forced demotion, where active demotion comprises the following:

When the value of some data stored in a higher-level storage device is lower than the value of some data stored in a lower-level storage device, the migration condition is triggered and demotion is performed according to the demotion rules until every item of data in the higher-level storage device has a higher value than every item of data in the lower-level storage device, at which point the demotion process ends;

Forced demotion comprises the following: when the used capacity of a storage device, as a proportion of its total storage space, exceeds the configured high-watermark threshold, the migration condition is triggered and the storage device migrates data according to the demotion rules until its used capacity falls below the high-watermark threshold.

Preferably, the demotion rules are as follows:

Evaluate the value of all data stored in the storage device to be migrated from using the data value evaluation method, and migrate data in ascending order of value, moving lower-value data to a lower-level storage device.

Preferably, the promotion process includes active promotion, which comprises the following:

When the value of some data stored in a lower-level storage device is higher than the value of some data stored in a higher-level storage device, the migration condition is triggered and promotion is performed according to the promotion rules until every item of data in the lower-level storage device has a lower value than every item of data in the higher-level storage device, at which point the promotion process ends.

Preferably, the promotion rules are as follows: evaluate the value of all data stored in the lower-level storage device using the data value evaluation method, and promote data in descending order of value, moving higher-value data to a higher-level storage device.

Preferably, the file access time T is calculated as shown in formula (2):

where Ti is the total time a given user spends accessing (including modifying) the file, and N is the number of users.

The data read/write frequency F is calculated as shown in formula (3):

where Ri and Wi denote the frequencies at which a given user reads and writes the file, respectively, each expressed as the reciprocal of the interval between accesses (including modifications); N is the number of users.

The number of users accessing the file, C, is calculated by formula (4):

where Ui denotes the record of a given user accessing the file; N is the number of users.

The data correlation degree R refers to the relationships between data files, including but not limited to: thematic correlation, time-version correlation, result-form correlation, and causal correlation;

The storage cost and migration cost evaluation Q includes but is not limited to: accounting of current or future investment in storage devices, the labor cost of storage migration, the impact on business systems, and the risk cost of data migration.

The beneficial effects of the present invention are:

The present invention provides a data hierarchical storage migration process method. By establishing a standardized process management method for the hierarchical storage of massive data, in particular a data promotion process based on user demand, it removes unreasonable factors such as subjectivity and arbitrariness from the online, near-line and offline tiered storage management of massive data, achieves efficient organization and dynamic storage management across online, near-line and offline storage resources throughout the full life cycle of massive data, and improves data security and service efficiency.

Description of the drawings

Figure 1 is a schematic diagram of the principle of the data hierarchical storage migration process method provided in the embodiment;

Figure 2 is a flow management diagram of data storage promotion migration based on user demand;

Figure 3 is a flow management diagram of data storage promotion migration;

Figure 4 is a flow management diagram of data storage demotion migration;

Figure 5 is a schematic diagram of the high/low watermark method based on storage space;

Figure 6 is a diagram of the storage order of the various types of result data;

Figure 7 is a schematic diagram of location monitoring for hierarchical data storage;

Figure 8 is a schematic diagram of the combined application of data stored at each level;

Figure 9 is a schematic diagram of migration process monitoring for hierarchical data storage.

Detailed description

To make the purpose, technical solution and advantages of the present invention clearer, the invention is described in further detail below with reference to the accompanying drawings. It should be understood that the specific embodiments described here only explain the present invention and do not limit it.

Embodiment

This embodiment provides a data hierarchical storage migration process method that takes the online, near-line and offline storage devices of a data center as its objects and dynamically manages data in the multi-level storage devices among the devices at each level. It comprises a data promotion process and a data demotion process, as shown in Figure 1. In the data promotion process, data in a lower-level storage device is migrated to a higher-level storage device according to the promotion rules: from offline storage (level 3) to near-line storage (level 2), from near-line storage (level 2) to online storage (level 1), or from offline storage (level 3) to online storage (level 1). In the data demotion process, data in a higher-level storage device is migrated to a lower-level storage device according to the demotion rules: from online storage (level 1) to near-line storage (level 2), from near-line storage (level 2) to offline storage (level 3), or from online storage (level 1) to offline storage (level 3).
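The three tiers and the two migration directions described above can be modelled in a few lines. The following is a minimal illustrative sketch, not part of the patent; the names `Tier` and `classify_move` are hypothetical and chosen only for this example.

```python
from enum import IntEnum


class Tier(IntEnum):
    """Storage levels from the embodiment; a smaller number is a higher level."""
    ONLINE = 1
    NEAR_LINE = 2
    OFFLINE = 3


def classify_move(src: Tier, dst: Tier) -> str:
    """Name the direction of a migration from src to dst."""
    if dst < src:
        return "promotion"   # e.g. offline (3) -> online (1)
    if dst > src:
        return "demotion"    # e.g. online (1) -> near-line (2)
    return "none"            # same tier, nothing to migrate


print(classify_move(Tier.OFFLINE, Tier.ONLINE))  # prints "promotion"
```

Because the comparison is on the level number, moves that skip a level (level 3 to level 1, or level 1 to level 3) are classified the same way as moves between adjacent levels, as in the embodiment.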

In the promotion process based on user demand, at particular times and under special circumstances users make heavier use of and access to data on a storage device at a certain level, so the value of that data becomes prominent; to improve service efficiency, that data needs to be promoted. For the data requested by the user, storage promotion is also carried out according to the different management and storage situations. The migration flow management is shown in Figure 2 and comprises the following steps:

S1: when a user accesses data X, determine where data X is located. If it is in the highest-level storage device, no migration is performed and the process goes directly to step S5; if it is in the demotion queue, go to step S2; if it is in a lower-level storage device, go to step S3; if it is in the promotion queue, go to step S4;

S2: check whether the demotion of data X has completed. If it has, go to step S3; if it has not, stop the migration, remove data X from the demotion queue, and go to step S5, keeping data X in the higher-level storage device;

S3: re-evaluate data X with the value evaluation method, adjust the data value ranking in the lower-level storage device, and place data X at the front of the promotion queue for promotion; once the promotion finishes and data X is in the higher-level storage device, go to step S5;

S4: check whether data X has completed the promotion process. If it has not started migrating, place data X at the front of the promotion queue and continue the migration; if it is being migrated, expedite the promotion and go to step S5 once the migration completes; if it has already been migrated, go directly to step S5;

S5: data X is kept in the higher-level storage device.
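The branching in steps S1 to S5 can be read as a small decision function. The sketch below is an illustration under stated assumptions, not the patented implementation: the enum names, the action strings, and the function `plan_on_access` are all hypothetical, and the function only returns the ordered list of actions rather than moving any data.

```python
from enum import Enum, auto


class Location(Enum):
    HIGHEST_TIER = auto()
    DEMOTION_QUEUE = auto()
    LOWER_TIER = auto()
    PROMOTION_QUEUE = auto()


class MigrationState(Enum):
    NOT_STARTED = auto()
    IN_PROGRESS = auto()
    COMPLETED = auto()


def plan_on_access(location, state=None):
    """Return the ordered actions taken when a user accesses data X."""
    if location is Location.HIGHEST_TIER:            # S1 -> S5
        return ["keep_in_higher_tier"]
    if location is Location.DEMOTION_QUEUE:          # S1 -> S2
        if state is MigrationState.COMPLETED:        # already demoted: S2 -> S3
            return plan_on_access(Location.LOWER_TIER)
        # Demotion not finished: cancel it and keep X where it is (S2 -> S5).
        return ["stop_demotion", "remove_from_demotion_queue",
                "keep_in_higher_tier"]
    if location is Location.LOWER_TIER:              # S3 -> S5
        return ["re_evaluate_value", "move_to_front_of_promotion_queue",
                "promote", "keep_in_higher_tier"]
    if location is Location.PROMOTION_QUEUE:         # S4 -> S5
        if state is MigrationState.COMPLETED:
            return ["keep_in_higher_tier"]
        if state is MigrationState.IN_PROGRESS:
            return ["expedite_promotion", "keep_in_higher_tier"]
        return ["move_to_front_of_promotion_queue", "promote",
                "keep_in_higher_tier"]
    raise ValueError(f"unknown location: {location!r}")


print(plan_on_access(Location.DEMOTION_QUEUE, MigrationState.IN_PROGRESS))
```

Every path ends in `keep_in_higher_tier`, which mirrors the fact that all branches of the flow converge on step S5.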

For the data demotion process, as shown in Figure 3, demotion is either active or forced depending on the demotion condition. The condition for active demotion is that the application level of the data has dropped and the data no longer meets the data value standard of its current storage level; it comprises the following:

When the value of some data stored in a higher-level storage device is lower than the value of some data stored in a lower-level storage device, the migration condition is triggered and demotion is performed according to the demotion rules until every item of data in the higher-level storage device has a higher value than every item of data in the lower-level storage device, at which point the demotion process ends. The value of all data stored in the storage device to be migrated from is evaluated with the data value evaluation method, and data is migrated in ascending order of value, moving lower-value data to a lower-level storage device.

The condition for forced demotion is that the storage space on a storage device is full or nearly full, so data has to be migrated. It comprises the following: under the high/low watermark method (shown in Figure 5), when the used capacity of a storage device, as a proportion of its total storage space, exceeds the configured high-watermark threshold, the migration condition is triggered and the storage device migrates data according to the demotion rules until its used capacity falls below the high-watermark threshold.
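Forced demotion combines the high-watermark trigger with the lowest-value-first demotion rule. The following sketch shows one way this could look; it assumes each data item is a `(name, size, value)` tuple with a value already computed by some evaluation method, and the function name `forced_demotion` and the sample numbers are hypothetical.

```python
def forced_demotion(items, capacity, high_watermark=0.85):
    """Demote lowest-value items until usage drops below the high watermark.

    items: list of (name, size, value) tuples stored on one device.
    Returns (kept, demoted), each a list of the same tuples.
    """
    kept = sorted(items, key=lambda item: item[2])   # lowest value first
    used = sum(size for _, size, _ in kept)
    demoted = []
    if used / capacity <= high_watermark:            # trigger not reached
        return kept, demoted
    while kept and used / capacity >= high_watermark:
        item = kept.pop(0)                           # lowest remaining value
        demoted.append(item)
        used -= item[1]
    return kept, demoted


# Hypothetical device of 100 units that is 90% full.
kept, demoted = forced_demotion(
    [("a", 40, 0.9), ("b", 30, 0.2), ("c", 20, 0.5)], capacity=100)
print([name for name, _, _ in demoted])  # prints ['b']
```

In this example demoting the single lowest-value item brings usage from 90% to 60%, so the loop stops after one migration.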

The promotion process has two promotion conditions: active promotion, in which the application level of the data rises so that its value exceeds the data standard of its current storage level, and promotion in which an access request for the data activates it:

The active promotion process is shown in Figure 4 and comprises the following: when the value of some data stored in a lower-level storage device is higher than the value of some data stored in a higher-level storage device, the migration condition is triggered and promotion is performed according to the promotion rules until every item of data in the lower-level storage device has a lower value than every item of data in the higher-level storage device, at which point the promotion process ends.

The promotion rules used are as follows: evaluate the value of all data stored in the lower-level storage device using the data value evaluation method, and promote data in descending order of value, moving higher-value data to a higher-level storage device.
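Active promotion between two adjacent tiers can be sketched as a loop over the lower tier in descending order of value. This is a simplified illustration: it assumes values have already been computed, it ignores capacity limits on the higher tier, and the function name `active_promotion` and the sample values are hypothetical.

```python
def active_promotion(higher: dict[str, float], lower: dict[str, float]) -> list[str]:
    """Promote items from `lower` to `higher`, highest value first.

    Both dicts map item name -> value and are modified in place.
    Stops when no item left in `lower` is worth more than the
    least valuable item that was originally in `higher`.
    """
    promoted: list[str] = []
    if not higher:
        return promoted                      # nothing to compare against
    floor = min(higher.values())             # lowest value in the higher tier
    for name in sorted(lower, key=lower.get, reverse=True):
        if lower[name] <= floor:             # trigger condition no longer holds
            break
        higher[name] = lower.pop(name)       # migrate the item upward
        promoted.append(name)
    return promoted


higher = {"a": 0.9, "b": 0.4}
lower = {"c": 0.7, "d": 0.3, "e": 0.5}
print(active_promotion(higher, lower))  # prints ['c', 'e']
```

Promoted items always have a value above `floor`, so the minimum of the higher tier does not change during the loop. Active demotion is the mirror image: iterate over the higher tier in ascending order of value and compare against the maximum value in the lower tier.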

The data value evaluation method used in this embodiment specifically comprises evaluating the value by formula (1):

where IP(X) denotes the inherent attributes of data file X, S the file size, T the file access time, F the data read/write frequency, C the number of users accessing the file, R the data correlation degree, and Q the storage cost and migration cost.

The file access time T is calculated as shown in formula (2):

where Ti is the total time a given user spends accessing (including modifying) the file, and N is the number of users.

The data read/write frequency F is calculated as shown in formula (3):

where Ri and Wi denote the frequencies at which a given user reads and writes the file, respectively, each expressed as the reciprocal of the interval between accesses (including modifications); N is the number of users.
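The per-user frequencies Ri and Wi are described as the reciprocal of the interval between accesses. The sketch below computes such a per-user frequency from a list of access timestamps. It is an illustration only: taking the mean interval is an assumption, the function name `access_frequency` is hypothetical, and the way formula (3) combines the per-user values over the N users is not reproduced in this text, so no aggregation is attempted here.

```python
def access_frequency(timestamps):
    """Reciprocal of the mean interval between consecutive accesses.

    timestamps: access times for one user and one file, in any fixed unit.
    Returns 0.0 when an interval cannot be formed (fewer than two
    distinct access times); that convention is an assumption.
    """
    if len(timestamps) < 2:
        return 0.0
    ordered = sorted(timestamps)
    span = ordered[-1] - ordered[0]
    if span == 0:
        return 0.0
    mean_interval = span / (len(ordered) - 1)
    return 1.0 / mean_interval


# One user read the file at hours 0, 2, 4 and 6: mean interval of 2 hours.
print(access_frequency([0, 2, 4, 6]))  # prints 0.5
```

The same function can be applied separately to a user's read timestamps and write timestamps to obtain that user's read and write frequencies.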

The number of users accessing the file, C, is calculated by formula (4):

where Ui denotes the record of a given user accessing the file; N is the number of users.

The data correlation degree R and the storage and migration cost evaluation Q are defined as follows. The data correlation degree R is the relationship between data items, covering subjective or objective links such as thematic correlation, time-version correlation, result-form correlation and causal correlation. The storage cost and migration cost evaluation Q covers accounting of current or future investment in storage devices, the labor cost of storage migration, the impact on business systems, the risk cost of data migration, and so on.

This embodiment takes the dynamic storage management of the various kinds of data in the data center of the Ministry of Natural Resources as an example, with storage management of the result data of the Second National Land Survey (which comprises many kinds of data) as a typical application, to illustrate the role and effect of this process method in managing the various kinds of data in that data center.

1. Estimate the storage capacity of the existing tiered storage devices and set the watermark thresholds

At present, the online, near-line and offline storage capacities of the Ministry of Natural Resources data center are 120 TB, 300 TB and 600 TB respectively. Given the current state of data management and services in the data center, the high, medium and low storage watermark thresholds of all three storage tiers are set to 85%, 50% and 15% respectively; that is, when storage usage exceeds 85%, part of the data is forcibly migrated and no new data is stored.

It is estimated that the currently available online, near-line and offline storage capacities are about 42 TB, 108 TB and 320 TB respectively, so each tier has some headroom.

2. Evaluate data value and assign storage tiers

Tiered data storage management is carried out using the reception and storage management of the Second National Land Survey result data as an example. By data type, the result data comprises raster image data, spatial vector data, document results and so on; by processing stage, it comprises original results, intermediate results and final results. In terms of access frequency and application value, final results rank highest, followed by intermediate results, with original results lowest; in terms of data volume, raster image data is largest, followed by spatial vector data, with document results smallest.

Building on the analysis of existing storage capacity in stage 1, the value of each kind of result data from the Second National Land Survey is estimated. The evaluation covers the number of natural resource supervision business users that can be served and their access frequency, combined with the characteristics of each kind of data (data volume, update cycle). The data value evaluation method based on the information life cycle is applied for qualitative or quantitative analysis, which determines the tiered storage order of the result data, from high to low, as final results, intermediate results, original results; within each class of results the order, from high to low, is document results, spatial vector results, image results. The application value and tiered storage order of these result data are shown in Figure 6.
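The two-level ordering described above (processing stage first, then data type) maps naturally onto a tuple sort key. The following is a small illustrative sketch; the dictionary names, the string labels and the function `storage_priority` are hypothetical and not taken from the patent.

```python
# Lower rank means higher storage priority, following the order in the text.
STAGE_RANK = {"final": 0, "intermediate": 1, "original": 2}
TYPE_RANK = {"document": 0, "vector": 1, "raster": 2}


def storage_priority(dataset):
    """Sort key for a (stage, data_type) pair: stage first, then type."""
    stage, data_type = dataset
    return (STAGE_RANK[stage], TYPE_RANK[data_type])


datasets = [
    ("original", "raster"),
    ("final", "raster"),
    ("intermediate", "document"),
    ("final", "document"),
    ("final", "vector"),
]
for stage, data_type in sorted(datasets, key=storage_priority):
    print(stage, data_type)
```

Sorting with this key lists all final results before any intermediate results, and within each stage lists document results before spatial vector results before image results.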

3. Implement hierarchical data storage

Following the storage tier priority established above and the estimated capacity of each existing storage tier in the data center, the 5 GB of document data, 200 GB of spatial vector data and 2 TB of raster image data in the final results are stored together in online storage, bringing the total data in online storage to 80.205 TB, about 66.84% of total capacity, which does not exceed 85%.

The 3 GB of document data, 100 GB of spatial vector data and 20 TB of raster image data in the intermediate results are stored together on the near-line storage devices, bringing the total data in near-line storage to 212.103 TB, about 70.70% of total capacity, which does not exceed 85%.

The original results consist mainly of some image data, about 18 TB. Because this data cannot directly provide data services, it is stored on the offline storage devices, bringing the total data in offline storage to 298 TB, about 49.67% of total capacity, which does not exceed 85%.
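The utilization figures above can be checked with a few lines of arithmetic. The sketch below assumes that used space equals total capacity minus the available capacity estimated in stage 1, and that 1 TB = 1000 GB, which is the conversion that reproduces the totals in the text; the variable and function names are hypothetical.

```python
CAPACITY_TB = {"online": 120, "near_line": 300, "offline": 600}
AVAILABLE_TB = {"online": 42, "near_line": 108, "offline": 320}
# Newly stored data per tier, in TB (GB values converted with 1 TB = 1000 GB).
INCOMING_TB = {
    "online": 0.005 + 0.2 + 2,      # 5 GB documents, 200 GB vector, 2 TB raster
    "near_line": 0.003 + 0.1 + 20,  # 3 GB documents, 100 GB vector, 20 TB raster
    "offline": 18,                  # about 18 TB of original image data
}
HIGH_WATERMARK = 0.85


def utilization(tier):
    """Return (used TB, used fraction of capacity) after storing the new data."""
    used = CAPACITY_TB[tier] - AVAILABLE_TB[tier] + INCOMING_TB[tier]
    return used, used / CAPACITY_TB[tier]


for tier in CAPACITY_TB:
    used, ratio = utilization(tier)
    print(f"{tier}: {used:.3f} TB used, below high watermark: {ratio < HIGH_WATERMARK}")
```

Under these assumptions the used fractions come out at roughly 0.668, 0.707 and 0.497, in line with the percentages stated above, and all three tiers stay below the 85% high watermark.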

Figures 7, 8 and 9 show the results after hierarchical storage was implemented.

Adopting the above technical solution disclosed by the present invention yields the following beneficial effects:

Based on the characteristics of storage devices themselves and of natural resource big data, the present invention proposes a standard process and method for the storage migration of natural resource big data and standardizes its dynamic storage management, solving problems in the following respects:

First, it solves the problem of standardizing the dynamic management of natural resource data storage. Based on storage cost accounting and the data value evaluation method, the method establishes a standardized process for storage management across the full life cycle of natural resource data, and provides a standard process method for the dynamic storage management of data at different life cycle stages. This avoids subjectivity and arbitrariness in data storage management and reduces its risks.

Second, it saves storage cost and improves the utilization efficiency of storage devices. The process method standardizes how data at different life cycle stages is stored and managed, allocates storage resources reasonably and dynamically, makes the most of the characteristics of the different classes of storage device, saves storage cost, and improves the utilization of storage devices and storage resources.

Third, it improves the efficiency of managing and serving the various kinds of natural resource data. Based on how each kind of natural resource data is used at different life cycle stages, the process method provides a standardized process for dynamically managing different classes of data on different classes of storage, which safeguards the service efficiency of different types of data and raises the management level and service efficiency of natural resource data.

The above are only preferred embodiments of the present invention. It should be noted that those of ordinary skill in the art may make a number of improvements and refinements without departing from the principles of the present invention, and such improvements and refinements shall also be regarded as falling within the scope of protection of the present invention.

Claims (6)

1. A data hierarchical storage migration process method, characterized in that a data storage migration process is performed on multi-level storage devices so that data is dynamically managed across the storage devices at every level; the multi-level storage devices are classified, from high to low according to access mode, access performance, and device security, into a first-level storage device, a second-level storage device, …, and an M-th-level storage device; the storage migration process comprises an upward data migration process and a downward data migration process, wherein upward migration means that data in a lower-level storage device is migrated to a higher-level storage device according to a migration rule, and downward migration means that data in a higher-level storage device is migrated to a lower-level storage device according to a migration rule;
the upward data migration process comprises upward migration based on user demand, specifically comprising the following steps:
S1: for data X accessed by a user, determine where data X currently resides; if it is in the highest-level storage device, no migration is performed and the process goes directly to step S5; if it is in the downward migration queue, go to step S2; if it is in a lower-level storage device, go to step S3; if it is in the upward migration queue, go to step S4;
S2: confirm whether the downward migration of data X has completed; if it has, go to step S3; if it has not, stop the downward migration, remove data X from the downward migration queue, and go to step S5, with data X retained in the higher-level storage device;
S3: re-evaluate data X using the data value evaluation method, adjust the data value ordering in the lower-level storage device, and place data X at the front of the upward migration queue for upward migration; once the upward migration has finished and data X has entered the higher-level storage device, go to step S5;
S4: confirm whether data X has completed the upward migration; if the migration has not started, place data X at the front of the upward migration queue and continue the upward migration; if the migration is in progress, accelerate it until it completes, then go to step S5; if the migration has completed, go directly to step S5;
S5: data X is retained in the higher-level storage device;
the data value evaluation method evaluates the value of the data through formula (1):
where IP(X) denotes the intrinsic attributes of data file X, S is the file size, T is the file access time, F is the data read/write frequency, C is the number of users accessing the file, R is the degree of data association, and Q is the assessment of storage cost and migration cost.
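The branching in steps S1 to S5 of claim 1 can be sketched as a small dispatch function. This is only an illustrative sketch, not the claimed implementation: the names `Location`, `Progress`, `Queues`, `move_to_front`, and `handle_access` are hypothetical, the queues are modelled as in-memory deques, the value re-evaluation of step S3 is left as a comment because formula (1) is not reproduced in this text, and "accelerating" a migration in step S4 is not modelled.

```python
from collections import deque
from dataclasses import dataclass, field
from enum import Enum, auto


class Location(Enum):
    HIGHEST_TIER = auto()  # data already resides in the highest-level device
    DOWN_QUEUE = auto()    # data is queued for migration to a lower level
    LOWER_TIER = auto()    # data resides in a lower-level device
    UP_QUEUE = auto()      # data is queued for migration to a higher level


class Progress(Enum):
    NOT_STARTED = auto()
    IN_PROGRESS = auto()
    DONE = auto()


@dataclass
class Queues:
    up: deque = field(default_factory=deque)    # toward the higher level
    down: deque = field(default_factory=deque)  # toward the lower level


def move_to_front(queue, x):
    """Put x at the head of the queue, removing any earlier entry for it."""
    if x in queue:
        queue.remove(x)
    queue.appendleft(x)


def handle_access(x, location, progress, queues):
    """Return the step labels visited for one user access to data x."""
    steps = ["S1"]
    if location is Location.DOWN_QUEUE:
        steps.append("S2")
        if progress is Progress.DONE:
            # The move to the lower level finished, so x is treated as lower-tier data.
            location = Location.LOWER_TIER
        elif x in queues.down:
            # The move has not finished: cancel it and keep x at the higher level.
            queues.down.remove(x)
    if location is Location.LOWER_TIER:
        steps.append("S3")
        # Assumption: the value of x would be re-evaluated here (formula (1)).
        move_to_front(queues.up, x)
    elif location is Location.UP_QUEUE:
        steps.append("S4")
        if progress is Progress.NOT_STARTED:
            move_to_front(queues.up, x)
        # IN_PROGRESS (accelerate) and DONE need no queue change in this sketch.
    steps.append("S5")  # x ends up retained in the higher-level device
    return steps
```

Calling `handle_access("x", Location.LOWER_TIER, Progress.NOT_STARTED, Queues())` visits S1, S3, and S5 and leaves `"x"` at the head of the queue toward the higher level.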
2. The data hierarchical storage migration process method according to claim 1, wherein the downward data migration process comprises active downward migration and forced downward migration, the active downward migration comprising:
when the value of some of the data stored in a higher-level storage device is lower than the value of some of the data stored in a lower-level storage device, the migration condition is triggered and migration is performed according to the migration rule until the value of every data item in the higher-level storage device is higher than the value of every data item in the lower-level storage device, at which point the migration process ends;
the forced downward migration comprising: when the proportion of a storage device's capacity in use relative to its total storage space is higher than a set high-water-mark threshold, the migration condition is triggered and the storage device migrates data according to the migration rule until the storage capacity in use falls below the high-water-mark threshold.
3. The data hierarchical storage migration process method according to claim 2, wherein the migration rule specifically comprises:
evaluating the value of all data stored in the storage device to be migrated using the data value evaluation method, and migrating data in order from low value to high value, so that lower-value data is migrated to a lower-level storage device.
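The forced migration to a lower level in claim 2, combined with the low-to-high value ordering of claim 3, can be sketched as follows. The function name `forced_down_migration` and its parameters are hypothetical, the value scores are assumed to have been computed already (formula (1) is not reproduced in this text), and each device is modelled as a plain dictionary mapping a data name to its size.

```python
def forced_down_migration(upper, lower, capacity, high_water, value):
    """Move lowest-value data from `upper` to `lower` once usage exceeds the threshold.

    upper, lower: dict mapping data name -> size (same unit as `capacity`)
    capacity:     total storage space of the upper device
    high_water:   usage ratio threshold, e.g. 0.8
    value:        dict mapping data name -> precomputed value score (assumed given)
    Returns the names that were moved, in migration order.
    """
    used = sum(upper.values())
    moved = []
    if used / capacity <= high_water:
        return moved  # trigger condition not met, nothing to migrate
    # Migrate in order of value, lowest first.
    for name in sorted(upper, key=lambda n: value[n]):
        if used / capacity < high_water:
            break  # usage has dropped below the threshold
        size = upper.pop(name)
        lower[name] = size
        used -= size
        moved.append(name)
    return moved
```

With `upper = {"a": 40, "b": 30, "c": 20}`, a capacity of 100, a threshold of 0.8, and `"b"` holding the lowest value score, only `"b"` is moved, because usage drops from 0.9 to 0.6 after that single migration.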
4. The data hierarchical storage migration process method according to claim 1, wherein the upward data migration process further comprises active upward migration, specifically comprising the following steps:
when the value of some of the data stored in a lower-level storage device is higher than the value of some of the data stored in a higher-level storage device, the migration condition is triggered and migration is performed according to the active upward migration rule until the value of every data item in the lower-level storage device is lower than the value of every data item in the higher-level storage device, at which point the migration process ends.
5. The data hierarchical storage migration process method according to claim 4, wherein the active upward migration rule specifically comprises: evaluating the value of all data stored in the lower-level storage device using the data value evaluation method, and migrating data in order from high value to low value, so that higher-value data is migrated to the higher-level storage device.
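The active migration to a higher level described in claims 4 and 5 can be sketched in the same style. The function name `active_up_migration` is hypothetical, value scores are assumed to be precomputed, and the capacity of the higher-level device is deliberately not modelled, so this only illustrates the trigger and the high-to-low ordering.

```python
def active_up_migration(lower, upper, value):
    """Promote highest-value data from `lower` while it outranks something in `upper`.

    lower, upper: sets of data names held by the lower- and higher-level device
    value:        dict mapping data name -> precomputed value score (assumed given)
    Returns the names that were promoted, in migration order.
    """
    moved = []
    if not lower or not upper:
        return moved  # nothing to compare, so the trigger cannot fire
    # Every promoted item scores above this floor, so the floor stays the minimum.
    floor = min(value[n] for n in upper)
    # Migrate in order of value, highest first.
    for name in sorted(lower, key=lambda n: value[n], reverse=True):
        if value[name] <= floor:
            break  # no remaining lower-level data outranks the higher level
        lower.remove(name)
        upper.add(name)
        moved.append(name)
    return moved
```

The loop stops as soon as the best remaining lower-level item no longer scores above the lowest-valued item at the higher level, which corresponds to the end condition stated in claim 4. Treating equal scores as "not higher" is a choice made for this sketch.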
6. The data hierarchical storage migration process method according to claim 1, wherein, in the data value evaluation formula, the file access time T is calculated as shown in formula (2):
where Ti is the sum of a given user's access (including modification) times for the file, and N is the number of users;
the data read/write frequency F is calculated as shown in formula (3):
where ri and wi are the frequencies at which a given user reads and writes the file, respectively, each expressed as the reciprocal of the interval between accesses (including modifications), and N is the number of users;
the number of users accessing the file, C, is calculated using formula (4):
where Ui is a record of a given user accessing the file, and N is the number of users;
the degree of data association R refers to the association relationships between data files, including but not limited to: thematic association, temporal version association, result form association, and causal association;
the storage cost and migration cost assessment Q includes, but is not limited to: investment accounting for current or future storage devices, labor costs of storage migration, impact on business systems, and data migration risk costs.
CN202311232833.6A 2023-02-27 2023-09-22 A data hierarchical storage migration process method Pending CN117193656A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN2023102060538 2023-02-27
CN202310206053 2023-02-27

Publications (1)

Publication Number Publication Date
CN117193656A true CN117193656A (en) 2023-12-08

Family

ID=88999631

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311232833.6A Pending CN117193656A (en) 2023-02-27 2023-09-22 A data hierarchical storage migration process method

Country Status (1)

Country Link
CN (1) CN117193656A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20231208
