CN106484712A - Data storage method and device for a distributed file system - Google Patents

Data storage method and device for a distributed file system

Info

Publication number
CN106484712A
Authority
CN
China
Prior art keywords
vsa
data
type
storage
application
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201510532946.7A
Other languages
Chinese (zh)
Inventor
黄文龙
车皓阳
杜涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
BEIJING YICHE INTERNET INFORMATION TECHNOLOGY Co Ltd
Original Assignee
BEIJING YICHE INTERNET INFORMATION TECHNOLOGY Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by BEIJING YICHE INTERNET INFORMATION TECHNOLOGY Co Ltd
Priority to CN201510532946.7A
Publication of CN106484712A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/17 Details of further file system functions
    • G06F16/1737 Details of further file system functions for reducing power consumption or coping with limited storage space, e.g. in mobile devices
    • G06F16/18 File system types
    • G06F16/182 Distributed file systems
    • G06F16/1824 Distributed file systems implemented using Network-attached Storage [NAS] architecture
    • G06F16/183 Provision of network file services by network file servers, e.g. by using NFS, CIFS

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Embodiments of the present invention provide a data storage method and device for a distributed file system. The method includes: after a virtual storage area (VSA) management module receives a storage request from an application, the VSA management module determines the application type of the application; the VSA management module then determines the VSA corresponding to that application type; and the VSA management module stores the data of the application according to the type and the setting area of the storage servers configured in that VSA. In the embodiments, data of different types of applications can be stored on storage servers of different types and in different setting areas, so the distributed file system can meet the data storage needs of each type of application.

Description

Data storage method and device for a distributed file system
Technical field
The present invention relates to the field of computer technology and, in particular, to a data storage method and device for a distributed file system.
Background
With the arrival of the big data era, big data processing has become increasingly important. Hadoop is a distributed system infrastructure developed by the Apache Software Foundation. It is currently one of the more popular distributed processing systems and is becoming the de facto standard for big data processing. Hadoop is easy to use, open source, and low cost. Architecturally, it is highly reliable, highly scalable, and highly flexible, and it supports data processing at the PB scale.
The original design goal of the Hadoop system was to process offline data quickly by means of Map-Reduce. With the rapid development of big data applications, however, other types of applications have emerged, such as online data processing, real-time data processing, and cold data processing. Compared with offline data processing, these differ greatly in data access latency, computing demand, and cost sensitivity, and the Hadoop system cannot handle the different types of applications with Map-Reduce alone.
To address this, YARN (Yet Another Resource Negotiator) was developed for the Hadoop system. YARN allocates, schedules, and reclaims cluster resources for different types of applications and allocates resources reasonably among them, so that a single Hadoop system can support multiple types of applications. For example, Fig. 1 shows a schematic diagram of the structural framework of a Hadoop system: through YARN, Hadoop decouples traditional resource management from Map-Reduce-based batch applications and supports resource scheduling for multiple types of applications, and thereby the development and management of multiple applications.
At present, the Hadoop system still uses HDFS (Hadoop Distributed File System) to store and access data. Fig. 2 shows a schematic diagram of an HDFS architecture. HDFS adopts a master-slave structure consisting of one name node (NameNode) and several data nodes (DataNode). The name node is the master server: it manages the file system namespace (such as names and paths) and clients' access to files. The data nodes are the storage servers: they manage the stored data.
However, the inventors found that existing YARN only supports allocating and scheduling different CPU and memory resources for different applications, while all applications are still given one uniform data storage method. That method supports writing large amounts of data once and reading it many times; does not support updating written data, although new data can be appended at the end of a file; configures the same number of data copies for all data; and does not distinguish between the hardware configurations of servers, storing the data of different applications on servers at random.
The data storage method of the existing Hadoop system therefore cannot meet the data storage needs of multiple types of applications. For example, suppose the data of an I/O-intensive online application is stored on ordinary servers. Ordinary servers are usually equipped only with mechanical hard disks for data storage, and reading and writing data on a mechanical hard disk is slow. An I/O-intensive online application typically needs to read and write stored data frequently within a short time, so the mechanical hard disks in ordinary servers become a speed bottleneck that severely limits the application's response speed. If, on the other hand, all servers in the HDFS file system are replaced with servers equipped with solid-state drives (SSDs), the storage needs of I/O-intensive online applications can be met, but at high cost. Moreover, ordinary servers already meet the storage needs of traditional offline batch applications, so servers equipped with SSDs would be a waste of resources for them.
It is therefore necessary to provide a more flexible data storage method and device for a distributed file system that meets the data storage needs of multiple types of applications.
Summary of the invention
In view of the shortcomings of the data storage approach of existing distributed file systems, the present invention proposes a data storage method and device for a distributed file system, to solve the problem in the prior art that the data storage needs of multiple types of applications cannot be met.
According to one aspect, an embodiment of the present invention provides a data storage method for a distributed file system, including:
after a virtual storage area (VSA) management module receives a storage request from an application, determining the application type of the application;
the VSA management module then determining the VSA corresponding to the application type; and
the VSA management module storing the data of the application according to the type and the setting area of the storage servers configured in the VSA.
According to another aspect, an embodiment of the present invention further provides a data storage device for a distributed file system, including:
multiple virtual storage areas (VSAs), each corresponding to a different application type; and
a VSA management module, configured to, after receiving a storage request from an application, determine the application type of the application, then determine the VSA corresponding to the application type, and store the data of the application according to the type and the setting area of the storage servers configured in the VSA.
In the embodiments of the present invention, the multiple VSAs in the distributed file system each correspond to a different application type. The VSA management module in the distributed file system determines the VSA corresponding to an application's type and stores the application's data according to the type and the setting area of the storage servers configured in that VSA. Data of different types of applications can thus be stored on storage servers of different types and in different setting areas, so the distributed file system of the embodiments can meet the data storage needs of each type of application.
Additional aspects and advantages of the present invention will be set forth in part in the description that follows; they will become apparent from that description or be learned through practice of the invention.
Brief description of the drawings
The above and/or additional aspects and advantages of the present invention will become apparent and easy to understand from the following description of the embodiments taken in conjunction with the accompanying drawings, in which:
Fig. 1 is a schematic diagram of the structural framework of a prior-art Hadoop system;
Fig. 2 is a schematic diagram of a prior-art HDFS architecture;
Fig. 3a is a schematic diagram of the architecture of the distributed file system of an embodiment of the present invention;
Fig. 3b is a schematic block diagram of the internal structure of the data storage device of an embodiment of the present invention;
Fig. 3c is a schematic diagram of an example of the correspondence between application types and storage server types in an embodiment of the present invention;
Fig. 3d is a schematic diagram of an example of the setting areas of the storage servers of each VSA in an embodiment of the present invention;
Fig. 4 is a schematic flowchart of the data storage method of an embodiment of the present invention;
Fig. 5 is a schematic flowchart of the data access method of an embodiment of the present invention;
Fig. 6 is a schematic block diagram of the internal structure of the VSA management module of an embodiment of the present invention.
Detailed description
Embodiments of the present invention are described in detail below. Examples of the embodiments are shown in the accompanying drawings, in which the same or similar reference numerals denote the same or similar elements, or elements with the same or similar functions, throughout. The embodiments described below with reference to the drawings are exemplary; they serve only to explain the present invention and are not to be construed as limiting the claims.
Those skilled in the art will understand that, unless expressly stated otherwise, the singular forms "a", "an", "the", and "said" used herein may also include the plural. It should be further understood that the word "comprising" used in this description indicates the presence of the stated features, integers, steps, operations, elements, and/or components, but does not exclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. It should be understood that when an element is said to be "connected" or "coupled" to another element, it may be directly connected or coupled to the other element, or intervening elements may be present. In addition, "connected" or "coupled" as used herein may include wireless connection or wireless coupling. The term "and/or" as used herein includes all or any of the units, and all combinations, of one or more of the associated listed items.
Those skilled in the art will understand that, unless otherwise defined, all terms used herein (including technical and scientific terms) have the same meaning as commonly understood by a person of ordinary skill in the art to which the present invention belongs. It should also be understood that terms such as those defined in general dictionaries should be understood to have a meaning consistent with their meaning in the context of the prior art and, unless specifically defined as here, are not to be interpreted in an idealized or overly formal sense.
Those skilled in the art will understand that "terminal" and "terminal device" as used herein include both devices that have only a wireless signal receiver with no transmitting capability, and devices with receiving and transmitting hardware capable of two-way communication over a bidirectional communication link. Such devices may include: cellular or other communication devices, with a single-line display, a multi-line display, or no multi-line display; PCS (Personal Communications Service) devices, which may combine voice, data processing, fax, and/or communication capabilities; PDAs (Personal Digital Assistants), which may include a radio frequency receiver, pager, Internet/intranet access, web browser, notepad, calendar, and/or GPS (Global Positioning System) receiver; and conventional laptop and/or palmtop computers or other devices that have and/or include a radio frequency receiver. A "terminal" or "terminal device" as used herein may be portable, transportable, installed in a vehicle (air, sea, and/or land), or suited and/or configured to run locally and/or in distributed form at any location on the earth and/or in space. A "terminal" or "terminal device" as used herein may also be a communication terminal, an access terminal, or a music/video playback terminal, for example a PDA, a MID (Mobile Internet Device), and/or a mobile phone with music/video playback functions, or a device such as a smart TV or set-top box.
The inventors considered that a VSA management module and multiple VSAs (Virtual Storage Areas) can be added to the distributed file system of the embodiments of the present invention, with each VSA corresponding to a different application type. Based on the storage request sent by an application, the VSA management module determines the application type of the application and then the VSA corresponding to that application type. The VSA management module then stores the application's data according to the type and the setting area of the storage servers configured in the VSA. In the embodiments, data of applications of different application types can thus be stored on different types of storage servers, meeting the respective data storage needs of the applications of each type.
In addition, in the embodiments of the present invention, a storage policy is set in advance for each VSA. The VSA management module can replicate and store the application's data according to the number of data copies specified in the storage policy. The storage policy therefore effectively corresponds to the application type: the data of applications of different types is replicated into a correspondingly different number of data copies for storage. Compared with replicating the data of every application using the same number of data copies, the embodiments can reduce the total number of data copies while still meeting each type of application's need for data backup, saving storage space and reducing cost.
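The saving from per-type copy counts can be illustrated with a small calculation. This is only a sketch: the data volumes, the uniform copy count of 3, and the per-type copy counts are hypothetical values chosen for illustration, not figures from the source.

```python
# Hypothetical data volumes (TB) per application type; not taken from the source.
DATA_TB = {"offline_batch": 100, "cold_data": 400, "io_intensive_online": 50}

# Hypothetical copy counts: one uniform value versus one value per application type.
UNIFORM_COPIES = 3
PER_TYPE_COPIES = {"offline_batch": 3, "cold_data": 0, "io_intensive_online": 3}


def copy_volume_tb(data_tb, copies_for):
    """Total space taken by data copies, given a per-type copy-count lookup."""
    return sum(size * copies_for(app_type) for app_type, size in data_tb.items())


uniform = copy_volume_tb(DATA_TB, lambda _app_type: UNIFORM_COPIES)
per_type = copy_volume_tb(DATA_TB, PER_TYPE_COPIES.__getitem__)
print(uniform, per_type, uniform - per_type)
```

With these assumed numbers, dropping the copies for cold data accounts for the whole difference between the two totals.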
The technical solutions of the embodiments of the present invention are described in detail below with reference to the accompanying drawings.
In the embodiments of the present invention, Fig. 3a shows a schematic diagram of the architecture of the HDFS-based distributed file system, which includes a master server node and multiple storage server nodes.
Each storage server node is connected to the master server node through the Internet. The master server node may specifically be a NameNode node; a storage server node may specifically be a DataNode node.
The metadata management system in the master server node manages the file namespace, coordinates clients' access to files, and manages the data stored on each storage server node. This metadata management system includes a data storage device.
Fig. 3b shows a schematic block diagram of the internal structure of the data storage device, which may include a VSA management module 301 and multiple VSAs 302.
The VSA management module 301 creates and maintains each VSA 302, configures the corresponding policies for each VSA 302 in advance, and interacts with each application through YARN; its specific functions are described in detail later.
A VSA 302 includes: the identifier of the VSA, the names and application type of the applications, the type of storage server, and the setting area of the storage servers. The setting area of a storage server may be the rack or the set of server nodes to which the storage server belongs. VSAn in Fig. 3b denotes the n-th VSA 302, where n is a natural number.
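The fields listed for a VSA can be pictured as a simple record. The following is a minimal sketch, assuming Python dataclasses; the class name, field names, and the sample application name are hypothetical and only mirror the fields named in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VSA:
    """One virtual storage area record with the fields named in the description."""
    vsa_id: str          # identifier of this VSA
    app_type: str        # application type served by this VSA
    app_names: tuple     # names of the applications of that type
    server_type: str     # type of storage server, e.g. "HDD", "SSD", "RAM"
    setting_area: tuple  # racks (or server node sets) holding the servers


# Example record for the cold data VSA; the application name is made up.
vsa3 = VSA(
    vsa_id="VSA3",
    app_type="cold_data",
    app_names=("archive-job",),
    server_type="HDD",
    setting_area=("rack3", "rack4"),
)
print(vsa3.server_type, vsa3.setting_area)
```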
In the embodiments of the present invention, before the data storage device stores application data, technicians can collect a large number of applications in advance and classify them to obtain the application type of each application; the identifier of each application is recorded together with its application type.
Preferably, the application type of an application may be one of the following: offline batch application type, compute-intensive online application type, cold data application type, input/output (I/O)-intensive online application type, and real-time application type.
For each application type, technicians create a VSA 302 corresponding to that application type and configure the type and the setting area of the storage servers for the VSA 302 according to the application type. Specifically, for each VSA 302, the identifier of the VSA, the names and application type of the applications corresponding to the VSA, and the type and setting area of the storage servers allocated to the VSA according to that application type are recorded together in the VSA.
Preferably, after the multiple VSAs 302 are created, the compute-intensive online, offline batch, cold data, I/O-intensive online, and real-time application types correspond to the first VSA (VSA1), second VSA (VSA2), third VSA (VSA3), fourth VSA (VSA4), and fifth VSA (VSA5), respectively.
The inventors further considered the following. Applications of the offline batch application type usually do not need to respond online. Applications of the compute-intensive online application type place high demands on the processing capability of the processor (such as the CPU) but do not need to read and write data stored on the storage servers frequently. Applications of the cold data application type usually never, or only rarely, access the data again once it is stored. These three application types therefore all have low requirements on data read/write speed. Applications of the I/O-intensive online application type need to read and write data stored on the storage servers frequently during online big data processing and therefore need a higher data read/write speed. Applications of the real-time application type, during real-time big data processing, need a data read/write speed far higher than that needed by applications of the I/O-intensive online application type.
Therefore, the storage servers in the first, second, and third VSAs can all be configured as the mechanical hard disk server type; the storage servers in the fourth VSA can be configured as the solid-state drive or cache server type; and the storage servers in the fifth VSA can be configured as the large-capacity memory server type.
For example, Fig. 3c shows a schematic diagram of the correspondence between application types and storage server types in the embodiments. In Fig. 3c, the storage servers in the first VSA, which corresponds to the compute-intensive online application type, and in the second VSA, which corresponds to the offline batch application type, are all of the mechanical hard disk server type; the storage servers in the third VSA, which corresponds to the cold data application type, are of the mechanical hard disk server type; the storage servers in the fourth VSA, which corresponds to the I/O-intensive online application type, are of the solid-state drive or cache server type; and the storage servers in the fifth VSA, which corresponds to the real-time application type, are of the large-capacity memory server type.
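The correspondence described for Fig. 3c amounts to two lookups, from application type to VSA and from VSA to storage server type. A minimal sketch follows; the string keys and the function name `server_type_for` are hypothetical labels, while the pairings themselves follow the text.

```python
# Application type -> VSA, as listed in the description.
APP_TYPE_TO_VSA = {
    "compute_intensive_online": "VSA1",
    "offline_batch": "VSA2",
    "cold_data": "VSA3",
    "io_intensive_online": "VSA4",
    "real_time": "VSA5",
}

# VSA -> storage server type (HDD = mechanical hard disk, RAM = large-capacity memory).
VSA_SERVER_TYPE = {
    "VSA1": "HDD",
    "VSA2": "HDD",
    "VSA3": "HDD",
    "VSA4": "SSD_OR_CACHE",
    "VSA5": "RAM",
}


def server_type_for(app_type: str) -> str:
    """Resolve the storage server type used for a given application type."""
    return VSA_SERVER_TYPE[APP_TYPE_TO_VSA[app_type]]


print(server_type_for("io_intensive_online"))
```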
Preferably, a setting area is configured for the storage servers in each VSA 302.
For example, in Fig. 3d, VSA1, VSA2, VSA3, VSA4, and VSA5 denote the first, second, third, fourth, and fifth VSAs, respectively; HDD, SSD, and RAM denote the mechanical hard disk server type, the solid-state drive or cache server type, and the large-capacity memory server type, respectively; and racks 1 to 7 denote seven different areas in which storage servers are located. The storage servers configured in VSA1 and VSA2 are located in racks 1, 2, and 5; those configured in VSA3 are located in racks 3 and 4; those configured in VSA4 are located in rack 6; and those configured in VSA5 are located in rack 7.
Preferably, multiple engine programs are packaged in the VSA management module 301. Each engine program is set up to handle data storage and data access for one of the different types of storage server configured in the VSAs, and each engine program is then recorded together with the corresponding type of storage server configured in the VSAs.
For example, in Fig. 3d, engine 1 corresponds to the mechanical hard disk server type configured in VSA1 and VSA2; engine 2 corresponds to the mechanical hard disk server type configured in VSA3; engine 3 corresponds to the solid-state drive or cache server type configured in VSA4; and engine 4 corresponds to the large-capacity memory server type configured in VSA5.
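The Fig. 3d example can be written down as a small configuration table, which also makes it easy to check which VSAs share racks. This is a sketch under assumptions: the dictionary layout, the engine name strings, and the helper `shares_racks` are hypothetical; the rack and engine assignments follow the text.

```python
# Per-VSA layout from the Fig. 3d example: server type, racks, and engine program.
VSA_LAYOUT = {
    "VSA1": {"server_type": "HDD", "racks": {1, 2, 5}, "engine": "engine1"},
    "VSA2": {"server_type": "HDD", "racks": {1, 2, 5}, "engine": "engine1"},
    "VSA3": {"server_type": "HDD", "racks": {3, 4}, "engine": "engine2"},
    "VSA4": {"server_type": "SSD_OR_CACHE", "racks": {6}, "engine": "engine3"},
    "VSA5": {"server_type": "RAM", "racks": {7}, "engine": "engine4"},
}


def shares_racks(vsa_a: str, vsa_b: str) -> bool:
    """True if the two VSAs have at least one rack in common."""
    return bool(VSA_LAYOUT[vsa_a]["racks"] & VSA_LAYOUT[vsa_b]["racks"])


print(shares_racks("VSA1", "VSA2"), shares_racks("VSA1", "VSA3"))
```

In this layout the cold data VSA (VSA3) shares no rack with any other VSA, even though it uses the same server type as VSA1 and VSA2.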
Preferably, before the data storage device stores application data, technicians can also set a storage policy for each VSA 302 in advance. The storage policy includes a data redundancy sub-policy, which may include the number of data copies.
Specifically, technicians can usually grade the importance of the data of the applications of each application type. For each VSA 302, the number of data copies of the VSA can be specified according to the importance level of the application data of the corresponding application type. Preferably, a larger number of data copies is specified for data with a higher importance level, and a smaller number of data copies, or zero, is specified for data with a lower importance level.
More preferably, the data redundancy sub-policy may also include the backup method for the data copies. For example, for data for which a larger number of data copies is specified, an EC method can be configured to store the data copies. EC methods are well known to those skilled in the art and are not described further here.
The inventors also noted that cold data is usually time-sensitive: once cold data has been stored on a storage server for a relatively long time, it is rarely accessed. Energy can therefore be saved by powering down the mechanical hard disks, which reduces the operating cost of the distributed file system of the embodiments.
Therefore, for the VSA 302 corresponding to the cold data application type, technicians can also configure an energy-saving sub-policy in the storage policy set for that VSA. The energy-saving sub-policy may specifically be a policy of powering down by time period or a policy of powering down by label.
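A storage policy with its two sub-policies could be represented roughly as follows. This is a sketch, not the patent's data model: the class `StoragePolicy`, its field names, the mode strings `by_period` and `by_label`, and the importance-to-copy-count table are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# The two energy-saving modes named in the text, under hypothetical identifiers.
POWER_DOWN_MODES = {"by_period", "by_label"}

# Hypothetical mapping from importance level to number of data copies.
COPIES_BY_IMPORTANCE = {"low": 0, "medium": 2, "high": 3}


@dataclass(frozen=True)
class StoragePolicy:
    copy_count: int                      # data redundancy sub-policy
    backup_method: Optional[str] = None  # optional backup method, e.g. "EC"
    power_down: Optional[str] = None     # energy-saving sub-policy, if any

    def __post_init__(self):
        if self.copy_count < 0:
            raise ValueError("copy_count must not be negative")
        if self.power_down is not None and self.power_down not in POWER_DOWN_MODES:
            raise ValueError(f"unknown power-down mode: {self.power_down}")


# Only the cold data VSA carries an energy-saving sub-policy.
POLICIES = {
    "VSA1": StoragePolicy(copy_count=COPIES_BY_IMPORTANCE["high"], backup_method="EC"),
    "VSA3": StoragePolicy(copy_count=COPIES_BY_IMPORTANCE["low"], power_down="by_period"),
}
print(POLICIES["VSA3"])
```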
Further, an independent setting area of storage servers can also be configured for the VSA 302 corresponding to the cold data application type.
For example, the storage servers in racks 1, 2, and 5 can be allocated to the first VSA, which corresponds to the compute-intensive online application type, and to the second VSA, which corresponds to the offline batch application type, while the storage servers in racks 3 and 4 are allocated to the third VSA, which corresponds to the cold data application type.
Before the data storage device stores the data of an application, the application carries its application identifier in a storage request and sends the request to YARN. YARN forwards the storage request to the VSA management module 301 in the data storage device.
In the embodiments of the present invention, Fig. 4 shows a schematic flowchart of the data storage method carried out according to the VSAs 302 and their respective storage policies described above. The method includes the following steps:
S401: After receiving a storage request from an application, the VSA management module 301 determines the application type of the application.
Specifically, after receiving the application's storage request through YARN, the VSA management module 301 parses the application identifier from the request, looks up the application type corresponding to the parsed identifier among the application identifiers and application types recorded together in advance, and takes the application type found as the application type of the application that sent the storage request.
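Step S401 is essentially a lookup keyed on the application identifier. A minimal sketch follows, assuming the storage request is a dictionary with an `app_id` field; the request shape, the registry contents, and the function name are hypothetical.

```python
# Pre-recorded application identifier -> application type (sample entries are made up).
APP_REGISTRY = {
    "app-001": "compute_intensive_online",
    "app-007": "cold_data",
}


def determine_app_type(request: dict) -> str:
    """Parse the application identifier from the request and look up its type."""
    app_id = request["app_id"]
    try:
        return APP_REGISTRY[app_id]
    except KeyError:
        raise LookupError(f"no application type recorded for {app_id!r}") from None


print(determine_app_type({"app_id": "app-007", "payload": b"..."}))
```

How an unregistered identifier should be handled is not stated in the text; raising an error here is just one possible choice.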
S402: The VSA management module 301 then determines the VSA 302 corresponding to the application type.
Specifically, from among the VSAs 302 created in advance, the VSA management module 301 determines the VSA 302 that corresponds to the application type determined in step S401.
For example, if the application type determined in step S401 is the compute-intensive online application type, the VSA 302 determined in this step as corresponding to that type is the first VSA.
As another example, if the application type determined in step S401 is the cold data application type, the VSA 302 determined in this step as corresponding to that type is the third VSA.
Preferably, after determining the VSA 302 corresponding to the application type, the VSA management module 301 can also obtain the storage policy set in advance for that VSA.
For example, after determining the first VSA, the VSA management module 301 can also obtain the data redundancy sub-policy in the storage policy set in advance for the first VSA.
As another example, after determining the third VSA, the VSA management module 301 can also obtain the data redundancy sub-policy and the energy-saving sub-policy in the storage policy set in advance for the third VSA. The energy-saving sub-policy may specifically be a policy of powering down by time period or a policy of powering down by label.
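Step S402, together with the optional policy retrieval, can be sketched as one function that returns the VSA and whatever sub-policies were set for it. The dictionary structure, key names, and `resolve_vsa` are hypothetical; the copy counts 3 and 0 are example values of the kind used later in the description.

```python
# Application type -> VSA (only the two types used in the examples).
APP_TYPE_TO_VSA = {"compute_intensive_online": "VSA1", "cold_data": "VSA3"}

# Storage policies set in advance per VSA, split into sub-policies.
STORAGE_POLICIES = {
    "VSA1": {"data_redundancy": {"copy_count": 3}},
    "VSA3": {
        "data_redundancy": {"copy_count": 0},
        "energy_saving": {"mode": "by_period"},
    },
}


def resolve_vsa(app_type: str):
    """Return the VSA for an application type and the storage policy set for it."""
    vsa_id = APP_TYPE_TO_VSA[app_type]
    return vsa_id, STORAGE_POLICIES.get(vsa_id, {})


vsa_id, policy = resolve_vsa("cold_data")
print(vsa_id, sorted(policy))
```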
S403: The VSA management module 301 stores the data of the application program according to the type of storage server and the set region of storage servers configured in the determined VSA.
Specifically, for the VSA 302 determined in step S402 above, the VSA management module 301 determines, from the engine programs and storage server types recorded in correspondence in advance, the engine program corresponding to the type of storage server configured in that VSA.
For the VSA 302 determined in step S402 above, the VSA management module 301 calls the determined engine program and, according to a data layout algorithm, stores the data to be stored of the application program that sent the storage request into the storage servers in the set region configured in that VSA. Data layout algorithms, and methods of storing data on servers according to such algorithms, are well known to those skilled in the art and are not described again here.
For example, for the first VSA, the VSA management module 301 calls the determined engine program and, according to the data layout algorithm, stores the data to be stored of the application program that sent the storage request into the storage servers in racks 1, 2 and 5.
As another example, for the third VSA, the VSA management module 301 calls the determined engine program and, according to the data layout algorithm, stores the data to be stored of the application program that sent the storage request into the storage servers in racks 3 and 4.
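Step S403 can be sketched as picking an engine by storage server type and then placing the data inside the VSA's set region. The sketch below is illustrative only: `hdd_engine`, the location string format, and the hash-based `choose_rack` are hypothetical stand-ins, since the description leaves the engine programs and the data layout algorithm unspecified. The rack numbers are those of the two examples above.

```python
import zlib
from typing import Callable, Dict, List


def hdd_engine(rack: int, data_id: str, data: bytes) -> str:
    """Stand-in engine for mechanical hard disk servers; returns a location string."""
    return f"hdd://rack{rack}/{data_id}"


# Engine programs recorded in advance against storage server types.
ENGINE_BY_SERVER_TYPE: Dict[str, Callable[[int, str, bytes], str]] = {
    "hdd": hdd_engine,
}

# Server type and set region (racks) configured in each VSA.
VSA_CONFIG = {
    "VSA1": {"server_type": "hdd", "racks": [1, 2, 5]},
    "VSA3": {"server_type": "hdd", "racks": [3, 4]},
}


def choose_rack(racks: List[int], data_id: str) -> int:
    """Placeholder layout: hash the data id onto one rack of the set region."""
    return racks[zlib.crc32(data_id.encode("utf-8")) % len(racks)]


def store(vsa: str, data_id: str, data: bytes) -> str:
    cfg = VSA_CONFIG[vsa]
    engine = ENGINE_BY_SERVER_TYPE[cfg["server_type"]]
    return engine(choose_rack(cfg["racks"], data_id), data_id, data)
```

Whatever layout algorithm is substituted for `choose_rack`, the constraint taken from the description is that the chosen server always lies inside the region configured for the VSA.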
Preferably, the VSA management module 301 calls the determined engine program to replicate the data to be stored of the application program that sent the storage request, according to the number of data copies specified by the data redundancy sub-policy in the storage policy determined in step S402 above. The VSA management module 301 then calls the determined engine program to store each data copy obtained by replication into the storage servers in the set region configured in the VSA.
For example, for the data redundancy sub-policy in the storage policy set for the first VSA, if the VSA management module 301 determines that the number of data copies specified by this sub-policy is 3, it calls the determined engine program to replicate the data to be stored of the application program that sent the storage request three times and, according to the data layout algorithm, stores the 3 resulting data copies into the storage servers in racks 1, 2 and 5.
As another example, for the data redundancy sub-policy in the storage policy set for the third VSA, if the VSA management module 301 determines that the number of data copies specified by this sub-policy is zero, it does not replicate the data to be stored of the application program that sent the storage request. Setting no data copies satisfies the application program's request to store cold data while saving storage space on the storage servers and reducing data storage cost. In addition, for the energy-saving sub-policy in the storage policy set for the third VSA, if the VSA management module 301 determines that this sub-policy is specifically the policy of powering down by time period, it powers down the mechanical hard disks in the storage servers in racks 1, 2 and 5 by time period; if it determines that this sub-policy is specifically the policy of powering down by label, it powers down each mechanical hard disk in the storage servers in racks 1, 2 and 5 by label. Once stored, cold data is rarely accessed for a long period; powering down the mechanical hard disks that store the cold data during the periods when it is not accessed saves energy and further reduces cost.
As another example, for the data redundancy sub-policy in the storage policy set for the fourth VSA, if the VSA management module 301 determines that the number of data copies specified by this sub-policy is 8+3, and that the backup method for data copies included in this sub-policy is the EC method, it calls the engine program corresponding to the type of storage server configured in the fourth VSA to replicate the data to be stored of the application program that sent the storage request 8+3 times and, according to the data layout algorithm and the EC method, stores the 8+3 resulting data copies into the storage servers in rack 6.
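The plain replication cases above (3 copies for the first VSA, zero copies for the third) can be sketched as a small placement planner. Two assumptions are made here and are not stated in the source: copies are spread round-robin over the racks of the set region, and a copy count of zero still means the data itself is written once. The EC method used for the fourth VSA is not modeled.

```python
from typing import List, Tuple


def plan_copies(data: bytes, replica_count: int, racks: List[int]) -> List[Tuple[int, bytes]]:
    """Return (rack, payload) pairs for every copy that will be written.

    Assumption: with replica_count == 0 the data is still stored once,
    it is simply not replicated.
    """
    copies = max(replica_count, 1)
    # Assumption: round-robin over the racks of the VSA's set region.
    return [(racks[i % len(racks)], data) for i in range(copies)]
```

For the first VSA this yields one copy in each of racks 1, 2 and 5; for the third VSA it yields a single write into the first rack of its region.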
More preferably, after storing the data of the application program, the engine program called by the VSA management module 301 returns the storage location information of the stored data to the VSA management module 301. The VSA management module 301 records the identifier of the data in correspondence with the storage location information of the data.
More preferably, based on the data identifiers and storage location information recorded in correspondence, an embodiment of the present invention further provides a data access method for the distributed file system. A schematic flowchart of the method is shown in Fig. 5, and the method includes the following steps:
S501: After receiving an access request from an application program, the VSA management module 301 parses out the identifier of the data to be accessed and determines the application type of the application program that sent the access request.
Specifically, after receiving the access request of the application program via YARN, the VSA management module 301 parses out the identifier of the data and the identifier of the application program. From the application program identifiers and application types recorded in correspondence in advance, it finds the application type corresponding to the parsed identifier and takes that application type as the application type of the application program that sent the access request.
S502: The VSA management module 301 then determines the storage location information of the data corresponding to the data identifier, and the VSA 302 corresponding to the application type.
Specifically, from the data identifiers and storage location information recorded in correspondence in step S403 above, the VSA management module 301 finds the storage location information of the data corresponding to the data identifier parsed out in step S501 above.
For the application type of the application program determined in step S501 above, the VSA management module 301 determines the VSA 302 corresponding to that application type in the same way as it determines the VSA 302 corresponding to the application type in step S402 above; the details are not repeated here.
S503: The VSA management module 301 obtains the data according to the determined storage location information of the data, and the type of storage server and the set region of storage servers configured in the determined VSA 302, and returns the data to the application program.
Specifically, for the VSA 302 determined in step S502 above, the VSA management module 301 determines, from the engine programs and storage server types recorded in correspondence in advance, the engine program corresponding to the type of storage server configured in that VSA.
The VSA management module 301 calls the determined engine program, obtains the data to be accessed from the storage servers in the set region configured in the determined VSA 302 according to the storage location information of the data determined in step S502 above, and returns the obtained data to the application program that sent the access request.
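Steps S501 to S503 can be sketched as a chain of lookups followed by an engine call. Everything named in this sketch is hypothetical: the request is modeled as a plain dictionary, the location as a tuple, and `hdd_read` as a dictionary read standing in for a real engine program. The YARN transport mentioned in the description is left out.

```python
from typing import Callable, Dict, Tuple

Location = Tuple[str, str, str]  # (rack, server, path); a hypothetical format

APP_TYPE_BY_APP_ID: Dict[str, str] = {"app-42": "cold_data"}
VSA_BY_APP_TYPE: Dict[str, str] = {"cold_data": "VSA3"}
SERVER_TYPE_BY_VSA: Dict[str, str] = {"VSA3": "hdd"}

# Recorded when the data was stored (identifier -> storage location).
LOCATION_BY_DATA_ID: Dict[str, Location] = {"d1": ("rack3", "server-a", "/blocks/d1")}
BLOBS: Dict[Location, bytes] = {("rack3", "server-a", "/blocks/d1"): b"cold bytes"}


def hdd_read(location: Location) -> bytes:
    """Stand-in engine read for mechanical hard disk servers."""
    return BLOBS[location]


ENGINE_BY_SERVER_TYPE: Dict[str, Callable[[Location], bytes]] = {"hdd": hdd_read}


def handle_access(request: Dict[str, str]) -> bytes:
    # S501: parse identifiers and resolve the application type.
    data_id, app_id = request["data_id"], request["app_id"]
    app_type = APP_TYPE_BY_APP_ID[app_id]
    # S502: resolve the storage location and the VSA.
    location = LOCATION_BY_DATA_ID[data_id]
    vsa = VSA_BY_APP_TYPE[app_type]
    # S503: pick the engine for the VSA's server type and fetch the data.
    engine = ENGINE_BY_SERVER_TYPE[SERVER_TYPE_BY_VSA[vsa]]
    return engine(location)
```

A call such as `handle_access({"app_id": "app-42", "data_id": "d1"})` walks the three steps in order and returns the stored bytes.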
More preferably, based on the above data storage method and access method of the distributed file system, a schematic block diagram of the internal structure of the VSA management module 301 in the embodiment of the present invention is shown in Fig. 6, and includes: an engine wrapper 601 and a storage configuration manager 602.
The engine wrapper 601 encapsulates multiple engine programs, each of which is used for data storage and access on one of the different types of storage server configured in the VSAs.
The storage configuration manager 602 is used to determine, after receiving a storage request from an application program, the application type of the application program and then the VSA 302 corresponding to that application type; and to call, in the engine wrapper 601, the engine program corresponding to the type of storage server configured in the determined VSA 302, so as to store the data to be stored of the application program into the storage servers of the set region in that VSA.
Preferably, the storage configuration manager 602 is further used to obtain, after determining the VSA 302 corresponding to the application type, the storage policy set in advance for that VSA 302; to call the engine program corresponding to the type of storage server configured in that VSA so as to replicate the data to be stored of the application program according to the number of data copies specified by the data redundancy sub-policy in the obtained storage policy; and to store each data copy obtained by replication into the storage servers of the set region in that VSA.
Preferably, the storage configuration manager 602 is further used to receive, after calling the engine program corresponding to the type of storage server configured in that VSA to store the data to be stored of the application program into the storage servers of the set region in the VSA, the storage location information of the stored data returned by the engine program, and to record the identifier of the stored data in correspondence with the storage location information of the data.
More preferably, as shown in Fig. 6, the VSA management module 301 further includes a data access scheduler 603.
The data access scheduler 603 is used to parse out, after receiving an access request from an application program, the identifier of the data to be accessed, and to determine the application type of the application program that sent the access request; then to determine the storage location information of the data corresponding to the data identifier, and the VSA 302 corresponding to the application type; and to obtain the data to be accessed according to the determined storage location information of the data, and the type of storage server and the set region of storage servers configured in the VSA 302, and return it to the application program that sent the access request.
Specifically, for the determined VSA 302, the data access scheduler 603 determines, from the engine programs and storage server types recorded in correspondence in advance, the engine program corresponding to the type of storage server configured in that VSA.
The data access scheduler 603 calls the determined engine program from the engine wrapper 601, obtains the data to be accessed from the storage servers in the set region configured in the determined VSA 302 according to the determined storage location information of the data, and returns the obtained data to the application program that sent the access request.
For the implementation of the functions of the engine wrapper 601, the storage configuration manager 602 and the data access scheduler 603 described above, reference may be made to the method steps shown in Figs. 4 and 5 above; the details are not repeated here.
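The division of responsibility among components 601, 602 and 603 can be sketched as three small classes that share one location index. This is an illustrative composition under stated assumptions: `DictEngine` is a hypothetical in-memory stand-in for an engine program, the location format is invented for the example, and application type resolution from the request is omitted.

```python
from typing import Dict, Tuple

Location = Tuple[str, str]  # (set region, data id); a hypothetical format


class DictEngine:
    """Stand-in engine program that keeps data in memory."""

    def __init__(self) -> None:
        self._blobs: Dict[Location, bytes] = {}

    def write(self, region: str, data_id: str, data: bytes) -> Location:
        location = (region, data_id)
        self._blobs[location] = data
        return location  # storage location returned to the caller

    def read(self, location: Location) -> bytes:
        return self._blobs[location]


class EngineWrapper:  # role of 601
    def __init__(self) -> None:
        self._engines: Dict[str, DictEngine] = {}

    def register(self, server_type: str, engine: DictEngine) -> None:
        self._engines[server_type] = engine

    def engine_for(self, server_type: str) -> DictEngine:
        return self._engines[server_type]


class StorageConfigManager:  # role of 602
    def __init__(self, engines, vsa_by_app_type, vsa_config, index) -> None:
        self.engines, self.vsa_by_app_type = engines, vsa_by_app_type
        self.vsa_config, self.index = vsa_config, index

    def store(self, app_type: str, data_id: str, data: bytes) -> None:
        cfg = self.vsa_config[self.vsa_by_app_type[app_type]]
        engine = self.engines.engine_for(cfg["server_type"])
        # Record the data identifier against the returned location.
        self.index[data_id] = engine.write(cfg["region"], data_id, data)


class DataAccessScheduler:  # role of 603
    def __init__(self, engines, vsa_by_app_type, vsa_config, index) -> None:
        self.engines, self.vsa_by_app_type = engines, vsa_by_app_type
        self.vsa_config, self.index = vsa_config, index

    def access(self, app_type: str, data_id: str) -> bytes:
        cfg = self.vsa_config[self.vsa_by_app_type[app_type]]
        engine = self.engines.engine_for(cfg["server_type"])
        return engine.read(self.index[data_id])


engines = EngineWrapper()
engines.register("hdd", DictEngine())
index: Dict[str, Location] = {}
vsa_by_app_type = {"cold_data": "VSA3"}
vsa_config = {"VSA3": {"server_type": "hdd", "region": "racks-3-4"}}
manager = StorageConfigManager(engines, vsa_by_app_type, vsa_config, index)
scheduler = DataAccessScheduler(engines, vsa_by_app_type, vsa_config, index)
manager.store("cold_data", "d1", b"archive")
```

Sharing the index between the two components reflects the description, in which the location recorded at storage time is what the access path later looks up.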
In the embodiment of the present invention, the multiple virtual storage regions (VSAs) in the distributed file system each correspond to a different application type. The VSA management module in the distributed file system determines, according to the application type of an application program, the VSA corresponding to that application type, and stores the data of the application program according to the type of storage server and the set region configured in that VSA. It can thus be seen that, in the embodiment of the present invention, the data of different types of application programs can be stored in storage servers of different set regions and types, so the distributed file system in the embodiment of the present invention can meet the data storage needs of all types of application programs.
Moreover, in the embodiment of the present invention, a storage policy is also set in advance for each VSA. A storage policy in effect corresponds to an application type: for the data of application programs of different application types, the corresponding number of data copies is made and stored. Compared with replicating and storing the data of every application program with the same number of data copies, the embodiment of the present invention can reduce the total number of data copies while still meeting the data backup needs of all types of application programs, thereby saving storage space and reducing cost.
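The saving claimed above is simple arithmetic and can be illustrated with a worked example. The data volumes below are hypothetical and chosen only for illustration; the per-type copy counts are the example values from the description (3 for the computation-intensive online type, zero for cold data).

```python
from typing import Dict


def total_copies(volume_by_type: Dict[str, int], copies_by_type: Dict[str, int]) -> int:
    """Total volume of data copies, in the same unit as the input volumes."""
    return sum(volume_by_type[t] * copies_by_type[t] for t in volume_by_type)


# Hypothetical volumes in TB.
volumes = {"compute_intensive_online": 10, "cold_data": 100}

uniform = {t: 3 for t in volumes}                             # same count for all
per_vsa = {"compute_intensive_online": 3, "cold_data": 0}     # per-VSA policies

uniform_total = total_copies(volumes, uniform)   # 10*3 + 100*3 = 330
per_vsa_total = total_copies(volumes, per_vsa)   # 10*3 + 100*0 = 30
```

Under these assumed volumes, the per-VSA policies avoid 300 TB of copies that a uniform three-copy policy would create.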
Further, in the embodiment of the present invention, for the VSA corresponding to the cold data application type, the storage policy set for that VSA also includes an energy-saving sub-policy. According to this energy-saving sub-policy, the VSA management module can power the mechanical hard disks in the storage servers that store cold data down and up, so as to save electric energy and further reduce cost.
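The two energy-saving sub-policies can be sketched as a function that selects which disks to power down. The source names the policies (by time period, by label) but does not define them, so the interpretation here is an assumption: "by period" powers down every disk of the VSA inside a configured daily window, and "by label" powers down the disks whose label appears in the policy. The dictionary layouts are hypothetical.

```python
from datetime import time
from typing import Dict, List


def in_period(now: time, start: time, end: time) -> bool:
    """True if now lies in [start, end), allowing windows that wrap past midnight."""
    if start <= end:
        return start <= now < end
    return now >= start or now < end


def disks_to_power_down(disks: List[Dict], policy: Dict, now: time) -> List[str]:
    if policy["kind"] == "by_period":
        # Assumed meaning: all disks are powered down inside the window.
        if in_period(now, policy["start"], policy["end"]):
            return [d["id"] for d in disks]
        return []
    if policy["kind"] == "by_label":
        # Assumed meaning: only disks carrying a listed label are powered down.
        return [d["id"] for d in disks if d["label"] in policy["labels"]]
    raise ValueError(f"unknown energy-saving policy: {policy['kind']}")
```

The midnight wrap in `in_period` matters for the likely case of a night-time window such as 22:00 to 06:00.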
In addition, in the embodiment of the present invention, the VSAs are independent of one another. Therefore, a new VSA can easily be added to the distributed file system in the embodiment of the present invention, and a new application type, a new type of storage server (for example, one with a new storage medium), or a new set region of storage servers can be configured for the new VSA, which gives the system good scalability.
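The independence of VSAs can be illustrated with a small registry in which adding a VSA never touches existing entries. The `VsaRegistry` class and the example entries (`VSA6`, `ml_training`, `nvme`, rack 7) are hypothetical and do not appear in the source.

```python
from typing import Dict, Tuple


class VsaRegistry:
    """Hypothetical registry: each VSA is an independent entry keyed by application type."""

    def __init__(self) -> None:
        self._vsa_by_app_type: Dict[str, str] = {}
        self._config_by_vsa: Dict[str, Dict[str, object]] = {}

    def register(self, vsa: str, app_type: str, server_type: str, racks: Tuple[int, ...]) -> None:
        if vsa in self._config_by_vsa or app_type in self._vsa_by_app_type:
            raise ValueError("VSA name or application type already registered")
        self._vsa_by_app_type[app_type] = vsa
        self._config_by_vsa[vsa] = {"server_type": server_type, "racks": racks}

    def lookup(self, app_type: str) -> Tuple[str, Dict[str, object]]:
        vsa = self._vsa_by_app_type[app_type]
        return vsa, self._config_by_vsa[vsa]


registry = VsaRegistry()
registry.register("VSA3", "cold_data", "hdd", (3, 4))
before = registry.lookup("cold_data")
# Hypothetical new VSA with a new application type, medium, and set region.
registry.register("VSA6", "ml_training", "nvme", (7,))
```

Because each registration only adds keys, the lookup result for an existing application type is the same before and after a new VSA is added.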
Those skilled in the art will appreciate that the present invention includes devices for performing one or more of the operations described in this application. These devices may be specially designed and manufactured for the required purposes, or may comprise known devices in general-purpose computers. These devices have computer programs stored in them that are selectively activated or reconfigured. Such computer programs may be stored in a device-readable (for example, computer-readable) medium, or in any type of medium suitable for storing electronic instructions and coupled to a bus. Such computer-readable media include, but are not limited to, any type of disk (including floppy disks, hard disks, optical discs, CD-ROMs, and magneto-optical disks), ROM (Read-Only Memory), RAM (Random Access Memory), EPROM (Erasable Programmable Read-Only Memory), EEPROM (Electrically Erasable Programmable Read-Only Memory), flash memory, magnetic cards, or optical cards. That is, a readable medium includes any medium that stores or transmits information in a form readable by a device (for example, a computer).
Those skilled in the art will appreciate that each block in these structural diagrams and/or block diagrams and/or flow diagrams, and combinations of blocks in them, can be implemented with computer program instructions. Those skilled in the art will also appreciate that these computer program instructions can be provided to a processor of a general-purpose computer, a special-purpose computer, or other programmable data processing means, so that the scheme specified in a block or blocks of the structural diagrams and/or block diagrams and/or flow diagrams disclosed in the present invention is executed by that processor.
Those skilled in the art will appreciate that the steps, measures, and schemes in the various operations, methods, and flows discussed in the present invention may be alternated, changed, combined, or deleted. Further, other steps, measures, and schemes in the various operations, methods, and flows discussed in the present invention may also be alternated, changed, rearranged, decomposed, combined, or deleted. Further, steps, measures, and schemes in the prior art that correspond to those in the various operations, methods, and flows disclosed in the present invention may also be alternated, changed, rearranged, decomposed, combined, or deleted.
The above are only some embodiments of the present invention. It should be noted that those of ordinary skill in the art may make several improvements and refinements without departing from the principles of the present invention, and these improvements and refinements shall also be regarded as falling within the scope of protection of the present invention.

Claims (12)

1. A data storage method for a distributed file system, characterized by comprising:
after receiving a storage request from an application program, a virtual storage region (VSA) management module determines the application type of the application program;
the VSA management module then determines the VSA corresponding to the application type;
the VSA management module stores the data of the application program according to the type of storage server and the set region of storage servers configured in the VSA.
2. The method according to claim 1, characterized in that the application type of the application program is specifically one of the following application types:
offline batch application type, computation-intensive online application type, cold data application type, input/output (I/O)-intensive online application type, and real-time application type.
3. The method according to claim 2, characterized in that the computation-intensive online application type corresponds to a first VSA, and the type of storage server configured in the first VSA is the mechanical hard disk server type; and
the offline batch application type corresponds to a second VSA, and the type of storage server configured in the second VSA is the mechanical hard disk server type; and
the cold data application type corresponds to a third VSA, and the type of storage server configured in the third VSA is the mechanical hard disk server type; and
the I/O-intensive online application type corresponds to a fourth VSA, and the type of storage server configured in the fourth VSA is the solid-state disk or cache server type; and
the real-time application type corresponds to a fifth VSA, and the type of storage server configured in the fifth VSA is the large-capacity memory server type.
4. The method according to any one of claims 1 to 3, characterized in that the storing according to the type of storage server and the set region of storage servers configured in the VSA specifically comprises:
the VSA management module determines the engine program corresponding to the type of storage server configured in the VSA;
the VSA management module calls the determined engine program to store the data to be stored of the application program into the storage servers of the set region.
5. The method according to claim 4, characterized in that, after the determining of the VSA corresponding to the application type, the method further comprises:
the VSA management module obtains the storage policy set in advance for the VSA; and
the engine program storing the data to be stored of the application program into the storage servers of the set region specifically comprises:
the VSA management module calls the engine program to replicate the data to be stored of the application program according to the number of data copies specified by the data redundancy sub-policy in the storage policy, and stores each data copy obtained by replication into the storage servers of the set region.
6. The method according to claim 4, characterized in that, after the engine program stores the data to be stored of the application program into the storage servers of the set region, the method further comprises:
the engine program returns the storage location information of the data;
the VSA management module records the identifier of the data in correspondence with the storage location information of the data.
7. The method according to claim 6, characterized by further comprising:
after receiving an access request from an application program, the VSA management module parses out the identifier of the data to be accessed and determines the application type of the application program that sent the access request;
the VSA management module then determines the storage location information of the data corresponding to the identifier of the data, and the VSA corresponding to the application type;
the VSA management module obtains the data according to the storage location information of the data, and the type of storage server and the set region of storage servers configured in the VSA, and returns the data to the application program.
8. A data storage device for a distributed file system, characterized by comprising:
multiple virtual storage regions (VSAs), each VSA corresponding to a different application type;
a VSA management module, used to determine, after receiving a storage request from an application program, the application type of the application program and then the VSA corresponding to the application type; and to store the data of the application program according to the type of storage server and the set region of storage servers configured in the VSA.
9. The device according to claim 8, characterized in that the VSA management module comprises:
an engine wrapper, which encapsulates multiple engine programs, each used for data storage and access on a different type of storage server;
a storage configuration manager, used to determine, after receiving a storage request from an application program, the application type of the application program and then the VSA corresponding to the application type; and to call, in the engine wrapper, the engine program corresponding to the type of storage server configured in the VSA, so as to store the data to be stored of the application program into the storage servers of the set region.
10. The device according to claim 9, characterized in that
the storage configuration manager is further used to obtain, after determining the VSA corresponding to the application type, the storage policy set in advance for the VSA; to call the engine program corresponding to the type of storage server configured in the VSA so as to replicate the data to be stored of the application program according to the number of data copies specified by the data redundancy sub-policy in the storage policy; and to store each data copy obtained by replication into the storage servers of the set region.
11. The device according to claim 10, characterized in that
the storage configuration manager is further used to receive, after calling the engine program to store the data to be stored of the application program into the storage servers of the set region, the storage location information of the data returned by the engine program, and to record the identifier of the data in correspondence with the storage location information of the data.
12. The device according to claim 11, characterized in that the VSA management module further comprises:
a data access scheduler, used to parse out, after receiving an access request from an application program, the identifier of the data to be accessed, and to determine the application type of the application program that sent the access request; then to determine the storage location information of the data corresponding to the identifier of the data, and the VSA corresponding to the application type; and to obtain the data according to the storage location information of the data, and the type of storage server and the set region of storage servers configured in the VSA, and return the data to the application program.
CN201510532946.7A 2015-08-27 2015-08-27 Data storage method and device for distributed file system Pending CN106484712A (en)

Publications (1)

Publication Number Publication Date
CN106484712A true CN106484712A (en) 2017-03-08

Family

ID=58234563

Country Status (1)

Country Link
CN (1) CN106484712A (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN202634482U (en) * 2012-03-08 2012-12-26 西安跃腾电子科技有限责任公司 Core configuration of college cloud calculation common information service platform and system application
CN102938784A (en) * 2012-11-06 2013-02-20 无锡江南计算技术研究所 Method and system used for data storage and used in distributed storage system
CN103593262A (en) * 2013-11-15 2014-02-19 上海爱数软件有限公司 Virtual machine backup method based on classification
CN103853633A (en) * 2014-02-14 2014-06-11 上海爱数软件有限公司 Application program injection type backup method based on operation information application discovery of virtual machine
CN104598495A (en) * 2013-10-31 2015-05-06 南京中兴新软件有限责任公司 Hierarchical storage method and system based on distributed file system
CN104782134A (en) * 2012-11-15 2015-07-15 日本电气株式会社 Server device, terminal, thin client system, screen transmission method and program

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109302382A (en) * 2018-08-29 2019-02-01 山东超越数控电子股份有限公司 A kind of construction method and system of polynary isomery storage service management platform
CN113835616A (en) * 2020-06-23 2021-12-24 华为技术有限公司 Applied data management method, system and computer device
WO2021258881A1 (en) * 2020-06-23 2021-12-30 华为技术有限公司 Data management method and system for application, and computer device
CN113835616B (en) * 2020-06-23 2025-06-13 华为技术有限公司 Application data management method, system and computer device
CN111787008A (en) * 2020-06-30 2020-10-16 北京指掌易科技有限公司 Access control method, device, electronic equipment and computer readable storage medium
CN111787008B (en) * 2020-06-30 2023-01-20 北京指掌易科技有限公司 Access control method, device, electronic equipment and computer readable storage medium
CN114579560A (en) * 2020-12-01 2022-06-03 中移(苏州)软件技术有限公司 Data platform and application method thereof

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20170308

RJ01 Rejection of invention patent application after publication