CN106484712A - Data storage method and device for a distributed file system - Google Patents
Data storage method and device for a distributed file system
- Publication number
- CN106484712A, CN201510532946.7A
- Authority
- CN
- China
- Prior art keywords
- vsa
- data
- type
- storage
- application
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/17—Details of further file system functions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/17—Details of further file system functions
- G06F16/1737—Details of further file system functions for reducing power consumption or coping with limited storage space, e.g. in mobile devices
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/18—File system types
- G06F16/182—Distributed file systems
- G06F16/1824—Distributed file systems implemented using Network-attached Storage [NAS] architecture
- G06F16/183—Provision of network file services by network file servers, e.g. by using NFS, CIFS
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
Embodiments of the present invention provide a data storage method and device for a distributed file system. The method includes: after a virtual storage area (VSA) management module receives a storage request from an application program, the module determines the application type of the application program; the VSA management module then determines the VSA corresponding to the application type; and the VSA management module stores the data of the application program according to the type of storage server and the setting area of the storage server configured in that VSA. In the embodiments, the data of different types of application programs can be stored in storage servers of different types and in different setting areas, so the distributed file system can meet the data storage needs of each type of application program.
Description
Technical field
The present invention relates to the field of computer technology, and in particular to a data storage method and device for a distributed file system.
Background
With the arrival of the big data era, big data processing has become increasingly important. Hadoop is a distributed system infrastructure developed by the Apache Software Foundation. It is currently a popular distributed processing system and is becoming the de facto standard for big data processing. Hadoop is simple to use, open source, and low cost. Architecturally, it offers high reliability, high scalability, and high flexibility, and supports data processing at the petabyte (PB) scale.
The original design goal of the Hadoop system was to process offline data quickly using Map-Reduce technology. However, with the rapid development of application programs that process big data, different types of application programs, such as online data processing, real-time data processing, and cold data processing, have begun to appear. Compared with offline data processing, these differ greatly in data access latency, computing requirements, and cost sensitivity, so the Hadoop system cannot handle the different types of application programs with Map-Reduce technology alone.
At present, YARN (Yet Another Resource Negotiator) has been developed for the Hadoop system to allocate, schedule, and reclaim cluster resources for different types of application programs. It allocates resources reasonably to each type of application program, so that a single Hadoop system can support multiple types of application programs. For example, Fig. 1 is a schematic diagram of the system structure of a Hadoop system: through YARN, Hadoop loosely couples traditional resource management with Map-Reduce-based batch application programs and supports resource scheduling for multiple types of application programs, thereby supporting the development and management of multiple application programs.
At present, the Hadoop system still uses HDFS (Hadoop Distributed File System) to store and access data. Fig. 2 is a schematic diagram of an HDFS architecture. HDFS uses a master-slave structure consisting of one name node (NameNode) and several data nodes (DataNode). The name node is the master server, which manages the file system namespace (such as names and paths) and client (Client) access to files; a data node is a storage server, which manages the stored data.
However, the inventors found that existing YARN only supports allocating and scheduling different CPU and memory resources for different application programs, and that existing YARN provides a single, uniform data storage method for all application programs. This method supports write-once, read-many access to massive data; does not support updating data that has been written, but allows new data to be appended at the end of a file; configures the same number of replicas for all data; and does not distinguish differences in hardware configuration between servers, storing the data of different application programs on servers at random.
The data storage method of the existing Hadoop system cannot meet the data storage needs of multiple types of application programs. For example, suppose the data of an I/O-intensive online application is stored on common servers. Common servers are usually equipped only with mechanical hard disks for data storage, and reading and writing data on a mechanical hard disk is slow. An I/O-intensive online application usually needs to read and write stored data frequently within a short time, so the mechanical hard disks in the common servers become a speed bottleneck and seriously limit the response speed of the application. If, instead, all servers in the HDFS file system are replaced with servers equipped with solid-state drives (SSDs), the data storage needs of I/O-intensive online applications can be met, but the cost is high. Moreover, common servers already meet the data storage needs of traditional offline batch applications, so giving those applications SSD-equipped servers wastes resources.
Therefore, it is necessary to provide a more flexible data storage method and device for a distributed file system, to meet the data storage needs of multiple types of application programs.
Summary of the invention
In view of the shortcomings of the data storage methods of existing distributed file systems, the present invention proposes a data storage method and device for a distributed file system, to solve the problem in the prior art that the data storage needs of multiple types of application programs cannot be met.
According to one aspect, an embodiment of the present invention provides a data storage method for a distributed file system, including:
after a virtual storage area (VSA) management module receives a storage request from an application program, determining the application type of the application program;
the VSA management module then determining the VSA corresponding to the application type; and
the VSA management module storing the data of the application program according to the type of storage server and the setting area of the storage server configured in the VSA.
According to another aspect, an embodiment of the present invention further provides a data storage device for a distributed file system, including:
multiple virtual storage areas (VSAs), each VSA corresponding to a different application type; and
a VSA management module, configured to determine, after receiving a storage request from an application program, the application type of the application program, then determine the VSA corresponding to the application type, and store the data of the application program according to the type of storage server and the setting area of the storage server configured in the VSA.
In the embodiment of the present invention, the multiple virtual storage areas (VSAs) in the distributed file system each correspond to a different application type. The VSA management module in the distributed file system determines, from the application type of an application program, the VSA corresponding to that application type, and stores the data of the application program according to the type and setting area of the storage servers configured in that VSA. Thus the data of different types of application programs can be stored in storage servers of different types and in different setting areas, and the distributed file system in the embodiment can meet the data storage needs of each type of application program.
Additional aspects and advantages of the present invention will be set forth in part in the following description; they will become apparent from that description or be learned through practice of the invention.
Brief description of the drawings
The above and/or additional aspects and advantages of the present invention will become apparent and easy to understand from the following description of the embodiments with reference to the accompanying drawings, in which:
Fig. 1 is a schematic diagram of the system structure of a prior-art Hadoop system;
Fig. 2 is a schematic diagram of a prior-art HDFS architecture;
Fig. 3a is a schematic diagram of the architecture of the distributed file system of the embodiment of the present invention;
Fig. 3b is a block diagram of the internal structure of the data storage device of the embodiment of the present invention;
Fig. 3c is a schematic diagram of an example of the correspondence between application types and storage server types in the embodiment of the present invention;
Fig. 3d is a schematic diagram of an example of the setting areas of the storage servers of each VSA in the embodiment of the present invention;
Fig. 4 is a flow chart of the data storage method of the embodiment of the present invention;
Fig. 5 is a flow chart of the data access method of the embodiment of the present invention;
Fig. 6 is a block diagram of the internal structure of the VSA management module of the embodiment of the present invention.
Detailed description
Embodiments of the present invention are described in detail below, and examples of the embodiments are shown in the accompanying drawings, in which the same or similar reference numerals throughout denote the same or similar elements or elements having the same or similar functions. The embodiments described below with reference to the drawings are exemplary; they serve only to explain the present invention and are not to be construed as limiting the claims.
Those skilled in the art will understand that, unless expressly stated otherwise, the singular forms "a", "an", "said", and "the" used herein may also include the plural forms. It should be further understood that the word "comprising" used in this description refers to the presence of the stated features, integers, steps, operations, elements, and/or components, but does not exclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. It should be understood that when an element is said to be "connected" or "coupled" to another element, it may be directly connected or coupled to the other element, or intervening elements may be present. In addition, "connected" or "coupled" as used herein may include wireless connection or wireless coupling. The term "and/or" as used herein includes all or any unit and all combinations of one or more of the associated listed items.
Those skilled in the art will understand that, unless otherwise defined, all terms used herein (including technical and scientific terms) have the same meaning as commonly understood by a person of ordinary skill in the art to which the present invention belongs. It should also be understood that terms such as those defined in general dictionaries should be understood to have a meaning consistent with their meaning in the context of the prior art and, unless specifically defined as they are here, will not be interpreted in an idealized or overly formal sense.
Those skilled in the art will understand that "terminal" and "terminal device" as used herein include both devices that have only a wireless signal receiver without transmitting capability, and devices that have receiving and transmitting hardware capable of two-way communication over a bidirectional communication link. Such devices may include: cellular or other communication devices, with a single-line display, with a multi-line display, or without a multi-line display; PCS (Personal Communications Service) devices, which may combine voice, data processing, fax, and/or data communication capabilities; PDAs (Personal Digital Assistants), which may include a radio frequency receiver, a pager, Internet/intranet access, a web browser, a notepad, a calendar, and/or a GPS (Global Positioning System) receiver; and conventional laptop and/or palmtop computers or other devices that have and/or include a radio frequency receiver. A "terminal" or "terminal device" as used herein may be portable, transportable, installed in a vehicle (air, sea, and/or land), or adapted and/or configured to run locally and/or in distributed form at any location on Earth and/or in space. A "terminal" or "terminal device" as used herein may also be a communication terminal, an Internet access terminal, or a music/video playback terminal, for example a PDA, a MID (Mobile Internet Device), and/or a mobile phone with music/video playback capability, or a device such as a smart TV or a set-top box.
The inventors considered that a VSA management module and multiple VSAs (Virtual Storage Areas) can be added to the distributed file system of the embodiment of the present invention, with each VSA corresponding to a different application type. The VSA management module can determine the application type of an application program from the storage request the application program sends, and then determine the VSA corresponding to that application type. The VSA management module then stores the data of the application program according to the type of storage server and the setting area of the storage server configured in the VSA. Thus, in the embodiment, the data of application programs of different application types can be stored in different types of storage servers, which meets the respective data storage needs of application programs of each application type.
In addition, in the embodiment of the present invention, a storage policy is set in advance for each VSA. The VSA management module can replicate and store the data to be stored for an application program according to the data replica count specified in the storage policy. The storage policy therefore effectively corresponds to the application type: for the data of application programs of different application types, the corresponding number of data replicas is made and stored. Compared with replicating and storing the data of every application program with the same data replica count, the embodiment can reduce the total number of data replicas while still meeting the data backup needs of each type of application program, thereby saving storage space and reducing cost.
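The saving described above can be made concrete with a small calculation. The sketch below is illustrative only: the data volumes, the uniform replica count, and the per-type replica counts are hypothetical figures, not values from the embodiment, and it assumes a replica count means the number of copies kept in addition to the original (consistent with a count of zero being allowed).

```python
# Hypothetical figures for illustration only; none come from the source.
# Data volume per application type, in terabytes.
data_tb = {
    "computation_intensive_online": 100,
    "offline_batch": 400,
    "cold_data": 300,
    "io_intensive_online": 150,
    "real_time": 50,
}

# Uniform scheme: every application type gets the same replica count.
UNIFORM_REPLICAS = 2

# Per-VSA scheme: each storage policy names its own replica count.
replicas_per_type = {
    "computation_intensive_online": 2,
    "offline_batch": 1,
    "cold_data": 0,
    "io_intensive_online": 2,
    "real_time": 1,
}


def total_stored_tb(volumes, replicas):
    """Space used when each type keeps the original plus `replicas[type]` copies."""
    return sum(volumes[t] * (1 + replicas[t]) for t in volumes)


uniform_total = total_stored_tb(data_tb, {t: UNIFORM_REPLICAS for t in data_tb})
per_vsa_total = total_stored_tb(data_tb, replicas_per_type)
saved_tb = uniform_total - per_vsa_total
```

With these assumed figures the per-VSA scheme stores less in total than the uniform scheme, because the types whose data matters less keep fewer copies.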
The technical solution of the embodiment of the present invention is described in detail below with reference to the accompanying drawings.
In the embodiment of the present invention, the architecture of the HDFS-based distributed file system is shown in Fig. 3a and includes a master server node and multiple storage server nodes.
Each storage server node is connected to the master server node through the Internet. The master server node may specifically be a NameNode node; a storage server node may specifically be a DataNode node.
The metadata management system in the master server node manages the file namespace, coordinates client access to files, and manages the data storage of each storage server node. This metadata management system includes a data storage device.
A block diagram of the internal structure of the data storage device is shown in Fig. 3b. The device may include a VSA management module 301 and multiple VSAs 302.
The VSA management module 301 creates and maintains each VSA 302, configures the corresponding policies for each VSA 302 in advance, and interacts with each application program through YARN, among other tasks; its specific functions are described in detail later.
A VSA 302 includes: the identifier of the VSA, the name and application type of the application program, the type of storage server, and the setting area of the storage server. The setting area of a storage server may be the rack or the server node set to which the storage server belongs. VSAn in Fig. 1 denotes the n-th VSA 302, where n is a natural number.
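The fields listed for a VSA 302 can be pictured as a simple record. The following is a minimal sketch; the class name, field names, and example values are all hypothetical and are not taken from the embodiment.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class VSARecord:
    """One virtual storage area; every field name here is hypothetical."""
    vsa_id: str                                            # identifier of this VSA
    app_type: str                                          # application type served
    app_names: List[str] = field(default_factory=list)     # application program names
    server_type: str = "HDD"                               # type of storage server
    setting_area: List[str] = field(default_factory=list)  # racks or server node sets


# Example record for a cold-data VSA placed on two racks.
vsa3 = VSARecord(
    vsa_id="VSA3",
    app_type="cold_data",
    app_names=["archive-app"],  # hypothetical application name
    server_type="HDD",
    setting_area=["rack3", "rack4"],
)
```

Keeping the setting area as a list lets the same record describe either racks or a set of server nodes, matching the two options named in the text.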
In the embodiment of the present invention, before the data storage device stores the data of application programs, technicians can collect a large number of application programs in advance, classify each of them to obtain its application type, and record the identifier of each application program in correspondence with its application type.
Preferably, the application type of an application program may specifically be one of the following: offline batch application type, computation-intensive online application type, cold data application type, input/output (I/O)-intensive online application type, and real-time application type.
For each application type, technicians create a VSA 302 corresponding to that application type and configure the type and setting area of the storage servers for the VSA 302 according to the application type. Specifically, for each VSA 302, the identifier of the VSA, the name and application type of the application programs corresponding to the VSA, and the type and setting area of the storage servers allocated to the VSA according to that application type are recorded in correspondence in the VSA.
Preferably, after multiple VSAs 302 are created, the computation-intensive online application type, the offline batch application type, the cold data application type, the I/O-intensive online application type, and the real-time application type are made to correspond to the first VSA (VSA1), the second VSA (VSA2), the third VSA (VSA3), the fourth VSA (VSA4), and the fifth VSA (VSA5), respectively.
The inventors also considered that application programs of the offline batch application type usually do not need to respond online; application programs of the computation-intensive online application type place high demands on the processing capability of the processor (such as the CPU) but do not need to read and write the data stored in the storage servers frequently; and application programs of the cold data application type usually access stored data rarely or not at all once it has been stored. Application programs of these three application types therefore all have relatively low requirements for data read/write speed. Application programs of the I/O-intensive online application type need to read and write the data stored in the storage servers frequently during online big data processing, and so need a higher data read/write speed. Application programs of the real-time application type, during real-time big data processing, need a data read/write speed far higher than that needed by application programs of the I/O-intensive online application type.
Therefore, the type of the storage servers in the first VSA, the second VSA, and the third VSA can all be configured as the mechanical hard disk server type; the type of the storage servers in the fourth VSA can be configured as the solid-state drive or cache server type; and the type of the storage servers in the fifth VSA can be configured as the large-capacity memory server type.
For example, in the embodiment of the present invention, the correspondence between the application types of application programs and the types of storage servers may be as shown in Fig. 3c. In Fig. 3c, the type of the storage servers in the first VSA, which corresponds to the computation-intensive online application type, and the type of the storage servers in the second VSA, which corresponds to the offline batch application type, are both the mechanical hard disk server type; the type of the storage servers in the third VSA, which corresponds to the cold data application type, is the mechanical hard disk server type; the type of the storage servers in the fourth VSA, which corresponds to the I/O-intensive online application type, is the solid-state drive or cache server type; and the type of the storage servers in the fifth VSA, which corresponds to the real-time application type, is the large-capacity memory server type.
Preferably, a setting area is configured for the storage servers in each VSA 302. For example, in Fig. 3d, VSA1, VSA2, VSA3, VSA4, and VSA5 denote the first, second, third, fourth, and fifth VSAs respectively; HDD, SSD, and RAM denote the mechanical hard disk server type, the solid-state drive or cache server type, and the large-capacity memory server type respectively; and racks 1-7 denote seven different areas in which storage servers are located. The storage servers configured in VSA1 and VSA2 are all located in racks 1, 2, and 5; the storage servers configured in VSA3 are all located in racks 3 and 4; the storage servers configured in VSA4 are located in rack 6; and the storage servers configured in VSA5 are located in rack 7.
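The example configuration of Fig. 3c and Fig. 3d can be written down as a small table. In the sketch below the VSA-to-type and VSA-to-rack assignments follow the example in the text, while the dictionary shape, key names, and the helper function are assumptions made for illustration.

```python
# Layout taken from the Fig. 3c / Fig. 3d example; the dictionary shape and
# key names are assumptions made for this sketch.
VSA_LAYOUT = {
    "VSA1": {"app_type": "computation_intensive_online", "server_type": "HDD", "racks": {1, 2, 5}},
    "VSA2": {"app_type": "offline_batch", "server_type": "HDD", "racks": {1, 2, 5}},
    "VSA3": {"app_type": "cold_data", "server_type": "HDD", "racks": {3, 4}},
    "VSA4": {"app_type": "io_intensive_online", "server_type": "SSD", "racks": {6}},
    "VSA5": {"app_type": "real_time", "server_type": "RAM", "racks": {7}},
}


def vsas_sharing_racks(layout, vsa_id):
    """Return the other VSAs whose setting area overlaps that of `vsa_id`."""
    racks = layout[vsa_id]["racks"]
    return sorted(
        other for other, cfg in layout.items()
        if other != vsa_id and cfg["racks"] & racks
    )
```

In this layout only VSA1 and VSA2 share racks; the cold-data VSA3 has racks of its own.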
Preferably, multiple engine programs are packaged in the VSA management module 301. After each engine program is set up to handle data storage and data access for one of the different types of storage server configured in the VSAs, the engine program is recorded in correspondence with the type of storage server configured in the VSA.
For example, in Fig. 3d, engine 1 corresponds to the mechanical hard disk server type configured in VSA1 and VSA2; engine 2 corresponds to the mechanical hard disk server type configured in VSA3; engine 3 corresponds to the solid-state drive or cache server type configured in VSA4; and engine 4 corresponds to the large-capacity memory server type configured in VSA5.
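The recorded correspondence between engine programs and storage server types can be sketched as two lookup tables. The assignments below follow the Fig. 3d example; the table names, the string labels for engines, and the `select_engine` helper are hypothetical.

```python
# Engine assignment from the Fig. 3d example. The string labels stand in for
# the engine programs packaged in the VSA management module (hypothetical names).
ENGINE_FOR_VSA = {
    "VSA1": "engine1",
    "VSA2": "engine1",
    "VSA3": "engine2",
    "VSA4": "engine3",
    "VSA5": "engine4",
}

# Storage server type each engine is set up to read and write.
SERVER_TYPE_FOR_ENGINE = {
    "engine1": "HDD",
    "engine2": "HDD",
    "engine3": "SSD",
    "engine4": "RAM",
}


def select_engine(vsa_id):
    """Return the engine recorded for a VSA and the server type it handles."""
    engine = ENGINE_FOR_VSA[vsa_id]
    return engine, SERVER_TYPE_FOR_ENGINE[engine]
```

Note that in the example two engines handle the same mechanical hard disk server type: engine 1 serves VSA1 and VSA2, while VSA3 has engine 2.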
Preferably, before the data storage device stores the data of application programs, technicians can also set a storage policy for each VSA 302 in advance. The storage policy includes a data redundancy sub-policy, and the data redundancy sub-policy may include a data replica count.
Specifically, technicians can usually grade the importance of the data of the application programs of each application type. For each VSA 302, the data replica count of the VSA can usually be specified according to the importance level of the application data of the application type corresponding to the VSA. Preferably, a larger data replica count is specified for data with a higher importance level; for data with a lower importance level, a smaller data replica count is specified, or the data replica count is set to zero.
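One way to express this rule is a table from importance level to replica count. The sketch below is an assumption throughout: the level names, the counts, and the importance assigned to each VSA are hypothetical, since the text only states that more important data gets more replicas and less important data gets fewer, possibly zero.

```python
# Hypothetical importance levels, ordered from least to most important.
IMPORTANCE_ORDER = ["low", "medium", "high"]

# Hypothetical replica counts per level; zero is allowed for the lowest level.
REPLICAS_BY_IMPORTANCE = {"low": 0, "medium": 1, "high": 2}

# Hypothetical importance assigned to the data of three VSAs.
VSA_IMPORTANCE = {"VSA1": "high", "VSA2": "medium", "VSA3": "low"}


def replica_count(vsa_id):
    """Data replica count specified in the data redundancy sub-policy of a VSA."""
    return REPLICAS_BY_IMPORTANCE[VSA_IMPORTANCE[vsa_id]]


def counts_follow_importance():
    """Check that a higher importance level never gets fewer replicas."""
    counts = [REPLICAS_BY_IMPORTANCE[level] for level in IMPORTANCE_ORDER]
    return all(a <= b for a, b in zip(counts, counts[1:]))
```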
More preferably, the data redundancy sub-policy can also include the backup method for data replicas. For example, for data with a larger specified data replica count, the EC method can be configured for storing the data replicas of that data. The EC method is well known to those skilled in the art and is not described again here.
The inventors also noted that cold data is usually time-sensitive: once cold data has been stored in a storage server for a long period, it is seldom accessed. Energy can therefore be saved by powering down the mechanical hard disks, which reduces the operating cost of the distributed file system in the embodiment of the present invention.
Therefore, for the VSA 302 corresponding to the cold data application type, technicians can also configure an energy-saving sub-policy in the storage policy set for that VSA. The energy-saving sub-policy may specifically be a policy of powering down by time period or a policy of powering down by label.
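The text names the two power-down strategies without detailing them. The sketch below shows one possible reading, and every detail of it is an assumption: "by time period" is modeled as a daily window during which disks may be powered down, and "by label" as a set of labels that mark disks as eligible. The class, field, and function names are hypothetical.

```python
from dataclasses import dataclass
from datetime import time
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class EnergySavingSubPolicy:
    """Hypothetical model of the two power-down strategies named in the text."""
    mode: str                             # "period" or "label"
    start: Optional[time] = None          # period mode: window start
    end: Optional[time] = None            # period mode: window end
    labels: FrozenSet[str] = frozenset()  # label mode: labels marked for power-down


def should_power_down(policy, now, disk_labels):
    """Decide whether a mechanical hard disk may be powered down right now."""
    if policy.mode == "period":
        if policy.start <= policy.end:
            return policy.start <= now < policy.end
        # Window wraps past midnight, e.g. 22:00 to 06:00.
        return now >= policy.start or now < policy.end
    if policy.mode == "label":
        return bool(policy.labels & set(disk_labels))
    raise ValueError(f"unknown energy-saving mode: {policy.mode}")


# Usage: a nightly window and a label-based rule (both hypothetical).
night = EnergySavingSubPolicy("period", start=time(22, 0), end=time(6, 0))
by_label = EnergySavingSubPolicy("label", labels=frozenset({"archived"}))
```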
Further, for the VSA 302 corresponding to the cold data application type, a separate setting area of storage servers can also be configured for that VSA.
For example, for the first VSA, which corresponds to the computation-intensive online application type, and the second VSA, which corresponds to the offline batch application type, the storage servers in racks 1, 2, and 5 can be allocated to the first VSA and the second VSA; for the third VSA, which corresponds to the cold data application type, the storage servers in racks 3 and 4 are allocated to the third VSA.
Before the data storage device stores the data of an application program, the application program carries its application program identifier in a storage request and sends the request to YARN. YARN forwards the storage request to the VSA management module 301 in the data storage device.
In the embodiment of the present invention, the flow of the data storage method, based on the VSAs 302 and their corresponding storage policies described above, is shown in Fig. 4 and includes the following steps:
S401: After receiving a storage request from an application program, the VSA management module 301 determines the application type of the application program.
Specifically, after receiving the storage request of the application program through YARN, the VSA management module 301 parses the identifier of the application program from the request. Among the pre-recorded correspondences between application program identifiers and application types, it looks up the application type corresponding to the parsed identifier, and takes the application type found as the application type of the application program that sent the storage request.
S402: The VSA management module 301 then determines the VSA 302 corresponding to the application type.
Specifically, among the VSAs 302 created in advance, the VSA management module 301 determines the VSA 302 corresponding to the application type of the application program determined in step S401.
For example, if the application type determined by the VSA management module 301 in step S401 is the computation-intensive online application type, the VSA 302 determined in this step as corresponding to the computation-intensive online application type is the first VSA.
As another example, if the application type determined by the VSA management module 301 in step S401 is the cold data application type, the VSA 302 determined in this step as corresponding to the cold data application type is the third VSA.
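Steps S401 and S402 amount to two table lookups. The sketch below illustrates them under stated assumptions: the application identifiers, the dictionary layout of the storage request, and the function name are hypothetical, while the mapping from application type to VSA follows the example in the text.

```python
# Pre-recorded correspondence between application identifiers and application
# types. The identifiers are hypothetical.
APP_TYPE_BY_ID = {
    "app-001": "computation_intensive_online",
    "app-002": "io_intensive_online",
    "app-003": "cold_data",
}

# Correspondence between application types and VSAs, as in the example.
VSA_BY_APP_TYPE = {
    "computation_intensive_online": "VSA1",
    "offline_batch": "VSA2",
    "cold_data": "VSA3",
    "io_intensive_online": "VSA4",
    "real_time": "VSA5",
}


def resolve_vsa(storage_request):
    """S401 then S402: identifier -> application type -> VSA."""
    app_id = storage_request["app_id"]  # assumed request field name
    if app_id not in APP_TYPE_BY_ID:
        raise LookupError(f"no application type recorded for {app_id!r}")
    app_type = APP_TYPE_BY_ID[app_id]
    return app_type, VSA_BY_APP_TYPE[app_type]
```

A request from the hypothetical `app-003` resolves to the cold data application type and the third VSA, matching the second example above.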
Preferably, after determining the VSA 302 corresponding to the application type, the VSA management module 301 can also obtain the storage strategy set in advance for this VSA.
For example, after determining the first VSA, the VSA management module 301 can also obtain the data redundancy sub-strategy in the storage strategy set in advance for the first VSA.
As another example, after determining the third VSA, the VSA management module 301 can also obtain the data redundancy sub-strategy and the energy-saving sub-strategy in the storage strategy set in advance for the third VSA. The energy-saving sub-strategy can specifically be a strategy of powering down by time period or a strategy of powering down by label.
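A per-VSA storage strategy with its two sub-strategies could be modelled as a small record. The sketch below is an assumption about representation only: the field names are hypothetical, and the copy counts (3 for the first VSA, zero for the third) are taken from the examples given later in the description.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StorageStrategy:
    copy_count: int                        # data redundancy sub-strategy
    power_down_mode: Optional[str] = None  # energy-saving sub-strategy: "by_period", "by_label" or None


# Hypothetical pre-set strategies, keyed by VSA number.
STRATEGY_BY_VSA = {
    1: StorageStrategy(copy_count=3),
    3: StorageStrategy(copy_count=0, power_down_mode="by_period"),
}


def get_storage_strategy(vsa_id: int) -> StorageStrategy:
    """Return the storage strategy set in advance for the given VSA."""
    return STRATEGY_BY_VSA[vsa_id]
```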
S403: The VSA management module 301 stores the data of the application program according to the type of the storage servers configured in the determined VSA and the setting area of those storage servers.
Specifically, for the VSA 302 determined in step S402, the VSA management module 301 determines, from the correspondence recorded in advance between engine programs and storage server types, the engine program corresponding to the type of the storage servers configured in this VSA.
For the VSA 302 determined in step S402, the VSA management module 301 calls the determined engine program and, according to a data layout algorithm, stores the data to be stored of the application program that sent the storage request in the storage servers of the setting area configured in this VSA. Data layout algorithms, and methods of storing data in servers according to such algorithms, are well known to those skilled in the art and are not described here.
For example, for the first VSA, the VSA management module 301 calls the determined engine program and, according to the data layout algorithm, stores the data to be stored of the application program that sent the storage request in the storage servers in racks 1, 2 and 5.
As another example, for the third VSA, the VSA management module 301 calls the determined engine program and, according to the data layout algorithm, stores the data to be stored of the application program that sent the storage request in the storage servers in racks 3 and 4.
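Step S403 combines two lookups: the engine program for the VSA's storage server type, and a target within the VSA's setting area. The sketch below uses the rack numbers from the two examples above; the engine names are hypothetical, and the CRC-based choice of rack is only a placeholder for the data layout algorithm, which the description leaves unspecified.

```python
import zlib
from typing import Tuple

# Hypothetical engine names keyed by storage server type.
ENGINE_BY_SERVER_TYPE = {"hdd": "hdd_engine", "ssd_or_cache": "ssd_engine"}

# Rack sets follow the examples for the first and third VSA.
VSA_CONFIG = {
    1: {"server_type": "hdd", "racks": [1, 2, 5]},
    3: {"server_type": "hdd", "racks": [3, 4]},
}


def choose_engine_and_rack(vsa_id: int, data_id: str) -> Tuple[str, int]:
    """Pick the engine for the VSA's server type and a rack in its setting area."""
    config = VSA_CONFIG[vsa_id]
    engine = ENGINE_BY_SERVER_TYPE[config["server_type"]]
    racks = config["racks"]
    # Placeholder layout rule (assumption): a stable hash of the data identifier.
    rack = racks[zlib.crc32(data_id.encode("utf-8")) % len(racks)]
    return engine, rack
```

Whatever layout rule is used, the chosen rack always belongs to the setting area configured for the VSA.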
Preferably, the VSA management module 301 calls the determined engine program to replicate the data to be stored of the application program that sent the storage request, according to the number of data copies specified by the data redundancy sub-strategy in the storage strategy obtained in step S402. The VSA management module 301 then calls the determined engine program to store each data copy obtained by replication in the storage servers of the setting area configured in the VSA.
For example, for the data redundancy sub-strategy in the storage strategy set for the first VSA, if the VSA management module 301 determines that the number of data copies specified by this sub-strategy is 3, it calls the determined engine program to replicate the data to be stored of the application program that sent the storage request 3 times and, according to the data layout algorithm, stores the 3 resulting data copies in the storage servers in racks 1, 2 and 5.
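The replication step can be sketched as producing one placement per copy. Assigning copies to racks in round-robin order is an assumption made for illustration; the description only says that the copies are placed according to the data layout algorithm.

```python
from itertools import cycle, islice
from typing import List, Tuple


def place_copies(data: bytes, copy_count: int, racks: List[int]) -> List[Tuple[int, bytes]]:
    """Return (rack, payload) pairs, one per copy, assigned round-robin over racks."""
    return [(rack, data) for rack in islice(cycle(racks), copy_count)]
```

With a copy count of 3 and racks 1, 2 and 5, each rack receives exactly one copy; a copy count of zero yields no placements.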
As another example, for the data redundancy sub-strategy in the storage strategy set for the third VSA, if the VSA management module 301 determines that the number of data copies specified by this sub-strategy is zero, it does not replicate the data to be stored of the application program that sent the storage request. Storing no data copies still satisfies the application program's storage request for cold data, while saving storage space in the storage servers and reducing the cost of data storage. In addition, for the energy-saving sub-strategy in the storage strategy set for the third VSA, if the VSA management module 301 determines that this sub-strategy is specifically the strategy of powering down by time period, the mechanical hard disks in the storage servers in racks 1, 2 and 5 are powered down by time period; if it determines that this sub-strategy is specifically the strategy of powering down by label, each mechanical hard disk in the storage servers in racks 1, 2 and 5 is powered down by label. After cold data is stored, it is rarely accessed for a long period of time. Powering down the mechanical hard disks that store the cold data during the periods in which it is not accessed saves energy and further reduces cost.
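The description does not say how a power-down period is represented. As a minimal sketch, assuming the period is a daily time window that may wrap past midnight, the decision to power down a disk could look like this:

```python
from datetime import time


def should_power_down(now: time, start: time, end: time) -> bool:
    """Return True if `now` falls inside the power-down window [start, end)."""
    if start <= end:
        return start <= now < end
    # The window wraps past midnight, for example 22:00 to 06:00.
    return now >= start or now < end
```

The function and its parameters are hypothetical; a real system would also have to power the disks back up when an access request for the cold data arrives.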
As another example, for the data redundancy sub-strategy in the storage strategy set for the fourth VSA, if the VSA management module 301 determines that the number of data copies specified by this sub-strategy is 8+3 and that the backup method of the data copies included in this sub-strategy is the EC method, it calls the engine program corresponding to the type of the storage servers configured in the fourth VSA to replicate the data to be stored of the application program that sent the storage request 8+3 times and, according to the data layout algorithm and the EC method, stores the 8+3 resulting data copies in the storage servers in rack 6.
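The description does not spell out what 8+3 means. Assuming the EC method is erasure coding and that 8+3 follows the usual notation of 8 data fragments plus 3 parity fragments, the raw storage consumed per unit of user data can be worked out as follows:

```python
from fractions import Fraction


def ec_storage_overhead(data_fragments: int, parity_fragments: int) -> Fraction:
    """Raw storage consumed per unit of user data under a k+m erasure code."""
    return Fraction(data_fragments + parity_fragments, data_fragments)
```

Under that assumption an 8+3 layout consumes 11/8 of the original data size, compared with 3 times the size when three full copies are kept.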
More preferably, after storing the data of the application program, the engine program called by the VSA management module 301 returns the storage location information of the stored data to the VSA management module 301. The VSA management module 301 records the identifier of this data in correspondence with the storage location information of this data.
More preferably, based on the recorded correspondence between data identifiers and storage location information, the embodiment of the present invention also provides a data access method for the distributed file system. The flow of the method is shown in Figure 5 and comprises the following steps:
S501: After receiving an access request from an application program, the VSA management module 301 parses the identifier of the data to be accessed from the request and determines the application type of the application program that sent the access request.
Specifically, after receiving the access request of the application program through YARN, the VSA management module 301 parses the identifier of the data and the identifier of the application program from the request. From the correspondence recorded in advance between application program identifiers and application types, it looks up the application type corresponding to the parsed application identifier and takes that type as the application type of the application program that sent the access request.
S502: The VSA management module 301 then determines the storage location information of the data corresponding to the data identifier, and the VSA 302 corresponding to the application type.
Specifically, from the correspondence between data identifiers and storage location information recorded in step S403, the VSA management module 301 looks up the storage location information of the data corresponding to the data identifier parsed in step S501.
For the application type determined in step S501, the VSA management module 301 determines the corresponding VSA 302 in the same way as it determines the VSA 302 corresponding to the application type in step S402, which is not described again here.
S503: The VSA management module 301 obtains the data according to the determined storage location information of the data and according to the type and the setting area of the storage servers configured in the determined VSA 302, and returns the data to the application program.
Specifically, for the VSA 302 determined in step S502, the VSA management module 301 determines, from the correspondence recorded in advance between engine programs and storage server types, the engine program corresponding to the type of the storage servers configured in this VSA.
The VSA management module 301 calls the determined engine program and, according to the storage location information of the data determined in step S502, obtains the data to be accessed from the storage servers of the setting area configured in the determined VSA 302, and then returns the obtained data to the application program that sent the access request.
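The store-then-access round trip (the engine program returns a storage location, the management module records it against the data identifier, and a later access resolves it) can be sketched with an in-memory stand-in. Both class names and the (rack, identifier) form of a location are hypothetical.

```python
from typing import Dict, Tuple

Location = Tuple[int, str]  # assumed form of a storage location: (rack, data identifier)


class InMemoryEngine:
    """Stand-in for an engine program; keeps data in a dict keyed by location."""

    def __init__(self) -> None:
        self._blocks: Dict[Location, bytes] = {}

    def write(self, rack: int, data_id: str, data: bytes) -> Location:
        location = (rack, data_id)
        self._blocks[location] = data
        return location  # the storage location is returned to the caller

    def read(self, location: Location) -> bytes:
        return self._blocks[location]


class VsaManager:
    """Records data identifier -> storage location, then resolves it on access."""

    def __init__(self, engine: InMemoryEngine) -> None:
        self._engine = engine
        self._location_by_data_id: Dict[str, Location] = {}

    def store(self, data_id: str, data: bytes, rack: int) -> None:
        self._location_by_data_id[data_id] = self._engine.write(rack, data_id, data)

    def access(self, data_id: str) -> bytes:
        return self._engine.read(self._location_by_data_id[data_id])
```

For example, after `manager.store("d1", b"hello", rack=3)`, a call to `manager.access("d1")` returns the stored bytes.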
More preferably, based on the data storage method and access method of the distributed file system described above, the internal structure of the VSA management module 301 in the embodiment of the present invention is shown schematically in Figure 6 and includes an engine wrapper 601 and a storage configuration manager 602.
The engine wrapper 601 encapsulates multiple engine programs, each used for data storage and access on a different type of storage server configured in the VSAs.
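One way to picture the engine wrapper is as a registry of engine programs keyed by storage server type. The interface below is an illustrative assumption; the description does not define how engine programs are registered or invoked.

```python
from typing import Callable, Dict

Engine = Callable[[bytes], str]  # assumed signature: takes data, returns a location string


class EngineWrapper:
    """Holds one engine program per storage server type."""

    def __init__(self) -> None:
        self._engines: Dict[str, Engine] = {}

    def register(self, server_type: str, engine: Engine) -> None:
        self._engines[server_type] = engine

    def get(self, server_type: str) -> Engine:
        return self._engines[server_type]
```

Callers such as the storage configuration manager then only need the server type of a VSA to obtain the matching engine program.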
The storage configuration manager 602 is used to determine, after receiving a storage request from an application program, the application type of the application program, and then the VSA 302 corresponding to this application type; and to call the engine program in the engine wrapper 601 that corresponds to the type of the storage servers configured in the determined VSA 302, so as to store the data to be stored of the application program in the storage servers of the setting area in this VSA.
Preferably, the storage configuration manager 602 is also used to obtain, after determining the VSA 302 corresponding to the application type, the storage strategy set in advance for this VSA 302; to call the engine program corresponding to the type of the storage servers configured in this VSA to replicate the data to be stored of the application program according to the number of data copies specified by the data redundancy sub-strategy in the obtained storage strategy; and to store each data copy obtained by replication in the storage servers of the setting area in this VSA.
Preferably, the storage configuration manager 602 is also used to receive, after calling the engine program corresponding to the type of the storage servers configured in this VSA to store the data to be stored of the application program in the storage servers of the setting area in the VSA, the storage location information of the stored data returned by this engine program, and to record the identifier of the stored data in correspondence with the storage location information of this data.
More preferably, as shown in Figure 6, the VSA management module 301 also includes a data access scheduler 603.
The data access scheduler 603 is used to parse, after receiving an access request from an application program, the identifier of the data to be accessed from the request, and to determine the application type of the application program that sent the access request; then to determine the storage location information of the data corresponding to the data identifier, and the VSA 302 corresponding to this application type; and to obtain the data to be accessed according to the determined storage location information and according to the type and the setting area of the storage servers configured in the VSA 302, and return it to the application program that sent the access request.
Specifically, for the determined VSA 302, the data access scheduler 603 determines, from the correspondence recorded in advance between engine programs and storage server types, the engine program corresponding to the type of the storage servers configured in this VSA.
The data access scheduler 603 calls the determined engine program from the engine wrapper 601 and, according to the determined storage location information of the data, obtains the data to be accessed from the storage servers of the setting area configured in the determined VSA 302, and then returns the obtained data to the application program that sent the access request.
For the way in which the functions of the engine wrapper 601, the storage configuration manager 602 and the data access scheduler 603 are implemented, reference may be made to the method steps shown in Figures 4 and 5, which are not described again here.
In the embodiment of the present invention, the multiple virtual storage areas (VSAs) in the distributed file system each correspond to a different application type. The VSA management module in the distributed file system determines, according to the application type of an application program, the VSA corresponding to that application type, and stores the data of the application program according to the type and the setting area of the storage servers configured in that VSA. It can be seen that, in the embodiment of the present invention, the data of different types of application programs can be stored in storage servers of different setting areas and types, so that the distributed file system in the embodiment of the present invention can meet the data storage needs of all types of application programs.
Moreover, in the embodiment of the present invention, a separate storage strategy is set in advance for each VSA. Each storage strategy in effect corresponds to an application type, so that the data of application programs of different application types is stored with a correspondingly different number of data copies. Compared with replicating and storing the data of every application program with the same number of data copies, the embodiment of the present invention can reduce the total number of data copies while still meeting the data backup needs of all types of application programs, thereby saving storage space and reducing cost.
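The saving from per-type copy counts can be illustrated with simple arithmetic. The data sizes below are invented for the example, and the sketch assumes that a copy count of zero still stores the original once, which is one reading of the cold data example.

```python
from typing import Dict


def stored_volume(size_by_type: Dict[str, int], copies_by_type: Dict[str, int]) -> int:
    """Total stored volume; a copy count of zero still keeps one instance (assumption)."""
    return sum(size * max(copies_by_type[app_type], 1)
               for app_type, size in size_by_type.items())


# Hypothetical volumes in GB for two application types.
sizes = {"online": 100, "cold": 500}
per_type_total = stored_volume(sizes, {"online": 3, "cold": 0})
uniform_total = stored_volume(sizes, {"online": 3, "cold": 3})
```

With these numbers the per-type strategy stores 800 GB, while a uniform count of three copies for both types stores 1800 GB.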
Further, in the embodiment of the present invention, for the VSA corresponding to the cold data application type, the storage strategy set for this VSA also includes an energy-saving sub-strategy. According to this energy-saving sub-strategy, the VSA management module can power the mechanical hard disks in the storage servers that store cold data down and up, so as to save electric energy and further reduce cost.
In addition, in the embodiment of the present invention, the VSAs are independent of one another. A new VSA can therefore be added easily to the distributed file system in the embodiment of the present invention, and the new VSA can be configured with a new application type, a new type of storage server (for example one with a new storage medium) or a new setting area of storage servers, which gives the system good scalability.
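Because the VSAs are independent of one another, adding one amounts to adding a single entry to the configuration without touching existing entries. The registry layout, the function name and the example values are hypothetical.

```python
from typing import Dict, List


def add_vsa(registry: Dict[int, dict], vsa_id: int, app_type: str,
            server_type: str, racks: List[int]) -> Dict[int, dict]:
    """Return a new registry with one more VSA; existing entries are left untouched."""
    if vsa_id in registry:
        raise ValueError(f"VSA {vsa_id} already exists")
    extended = dict(registry)
    extended[vsa_id] = {
        "app_type": app_type,
        "server_type": server_type,
        "racks": list(racks),
    }
    return extended
```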
Those skilled in the art will appreciate that the present invention includes devices for performing one or more of the operations described in this application. These devices may be specially designed and manufactured for the required purposes, or may comprise known devices in general-purpose computers. These devices have computer programs stored in them that are selectively activated or reconfigured. Such computer programs can be stored in a device-readable (for example, computer-readable) medium or in any type of medium suitable for storing electronic instructions and coupled to a bus, including but not limited to any type of disk (including floppy disks, hard disks, optical discs, CD-ROMs and magneto-optical disks), ROM (Read-Only Memory), RAM (Random Access Memory), EPROM (Erasable Programmable Read-Only Memory), EEPROM (Electrically Erasable Programmable Read-Only Memory), flash memory, magnetic cards or optical cards. That is, a readable medium includes any medium that stores or transmits information in a form readable by a device (for example, a computer).
Those skilled in the art will appreciate that computer program instructions can be used to implement each block in these structure diagrams and/or block diagrams and/or flow diagrams, and combinations of blocks in them. Those skilled in the art will also appreciate that these computer program instructions can be supplied to the processor of a general-purpose computer, a special-purpose computer or other programmable data processing apparatus for execution, so that the schemes specified in the block or blocks of the structure diagrams and/or block diagrams and/or flow diagrams disclosed by the present invention are executed by the processor of the computer or other programmable data processing apparatus.
Those skilled in the art will appreciate that the steps, measures and schemes in the various operations, methods and flows discussed in the present invention can be alternated, changed, combined or deleted. Further, other steps, measures and schemes in the various operations, methods and flows discussed in the present invention can also be alternated, changed, rearranged, decomposed, combined or deleted. Further, steps, measures and schemes in the prior art that correspond to the various operations, methods and flows disclosed in the present invention can also be alternated, changed, rearranged, decomposed, combined or deleted.
The above are only some embodiments of the present invention. It should be noted that those of ordinary skill in the art can make several improvements and refinements without departing from the principles of the present invention, and these improvements and refinements should also be regarded as falling within the protection scope of the present invention.
Claims (12)
1. A data storage method for a distributed file system, characterized by comprising:
after receiving a storage request from an application program, a virtual storage area (VSA) management module determines the application type of the application program;
the VSA management module then determines the VSA corresponding to the application type;
the VSA management module stores the data of the application program according to the type of the storage servers configured in the VSA and the setting area of the storage servers.
2. The method according to claim 1, characterized in that the application type of the application program is specifically one of the following application types:
offline batch processing application type, compute-intensive online application type, cold data application type, input/output (I/O) intensive online application type, real-time application type.
3. The method according to claim 2, characterized in that the compute-intensive online application type corresponds to a first VSA, and the type of the storage servers configured in the first VSA is the mechanical hard disk server type; and
the offline batch processing application type corresponds to a second VSA, and the type of the storage servers configured in the second VSA is the mechanical hard disk server type; and
the cold data application type corresponds to a third VSA, and the type of the storage servers configured in the third VSA is the mechanical hard disk server type; and
the I/O intensive online application type corresponds to a fourth VSA, and the type of the storage servers configured in the fourth VSA is the solid state disk or cache server type; and
the real-time application type corresponds to a fifth VSA, and the type of the storage servers configured in the fifth VSA is the large-capacity memory server type.
4. The method according to any one of claims 1-3, characterized in that the storing according to the type of the storage servers configured in the VSA and the setting area of the storage servers specifically comprises:
the VSA management module determines the engine program corresponding to the type of the storage servers configured in the VSA;
the determined engine program is called to store the data to be stored of the application program in the storage servers of the setting area.
5. The method according to claim 4, characterized in that, after the VSA corresponding to the application type is determined, the method further comprises:
the VSA management module obtains the storage strategy set in advance for the VSA; and
the storing, by the engine program, of the data to be stored of the application program in the storage servers of the setting area specifically comprises:
the VSA management module calls the engine program to replicate the data to be stored of the application program according to the number of data copies specified by the data redundancy sub-strategy in the storage strategy, and to store each data copy obtained by replication in the storage servers of the setting area.
6. The method according to claim 4, characterized in that, after the engine program stores the data to be stored of the application program in the storage servers of the setting area, the method further comprises:
the engine program returns the storage location information of the data;
the VSA management module records the identifier of the data in correspondence with the storage location information of the data.
7. The method according to claim 6, characterized by further comprising:
after receiving an access request from an application program, the VSA management module parses the identifier of the data to be accessed from the request and determines the application type of the application program that sent the access request;
the VSA management module then determines the storage location information of the data corresponding to the identifier of the data, and the VSA corresponding to the application type;
the VSA management module obtains the data according to the storage location information of the data and according to the type of the storage servers configured in the VSA and the setting area of the storage servers, and returns the data to the application program.
8. A data storage device for a distributed file system, characterized by comprising:
multiple virtual storage areas (VSAs), each VSA corresponding to a different application type;
a VSA management module, used to determine, after receiving a storage request from an application program, the application type of the application program, and then the VSA corresponding to the application type; and to store the data of the application program according to the type of the storage servers configured in the VSA and the setting area of the storage servers.
9. The device according to claim 8, characterized in that the VSA management module comprises:
an engine wrapper, in which multiple engine programs are encapsulated, each used for data storage and access on a different type of storage server;
a storage configuration manager, used to determine, after receiving a storage request from an application program, the application type of the application program, and then the VSA corresponding to the application type; and to call the engine program in the engine wrapper that corresponds to the type of the storage servers configured in the VSA, so as to store the data to be stored of the application program in the storage servers of the setting area.
10. The device according to claim 9, characterized in that
the storage configuration manager is also used to obtain, after determining the VSA corresponding to the application type, the storage strategy set in advance for the VSA; to call the engine program corresponding to the type of the storage servers configured in the VSA to replicate the data to be stored of the application program according to the number of data copies specified by the data redundancy sub-strategy in the storage strategy; and to store each data copy obtained by replication in the storage servers of the setting area.
11. The device according to claim 10, characterized in that
the storage configuration manager is also used to receive, after calling the engine program to store the data to be stored of the application program in the storage servers of the setting area, the storage location information of the data returned by the engine program, and to record the identifier of the data in correspondence with the storage location information of the data.
12. The device according to claim 11, characterized in that the VSA management module further comprises:
a data access scheduler, used to parse, after receiving an access request from an application program, the identifier of the data to be accessed from the request, and to determine the application type of the application program that sent the access request; then to determine the storage location information of the data corresponding to the identifier of the data, and the VSA corresponding to the application type; and to obtain the data according to the storage location information of the data and according to the type of the storage servers configured in the VSA and the setting area of the storage servers, and return the data to the application program.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510532946.7A CN106484712A (en) | 2015-08-27 | 2015-08-27 | The date storage method of distributed file system and device |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106484712A true CN106484712A (en) | 2017-03-08 |
Family
ID=58234563
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201510532946.7A Pending CN106484712A (en) | 2015-08-27 | 2015-08-27 | The date storage method of distributed file system and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106484712A (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN202634482U (en) * | 2012-03-08 | 2012-12-26 | 西安跃腾电子科技有限责任公司 | Core configuration of college cloud calculation common information service platform and system application |
CN102938784A (en) * | 2012-11-06 | 2013-02-20 | 无锡江南计算技术研究所 | Method and system used for data storage and used in distributed storage system |
CN103593262A (en) * | 2013-11-15 | 2014-02-19 | 上海爱数软件有限公司 | Virtual machine backup method based on classification |
CN103853633A (en) * | 2014-02-14 | 2014-06-11 | 上海爱数软件有限公司 | Application program injection type backup method based on operation information application discovery of virtual machine |
CN104598495A (en) * | 2013-10-31 | 2015-05-06 | 南京中兴新软件有限责任公司 | Hierarchical storage method and system based on distributed file system |
CN104782134A (en) * | 2012-11-15 | 2015-07-15 | 日本电气株式会社 | Server device, terminal, thin client system, screen transmission method and program |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109302382A (en) * | 2018-08-29 | 2019-02-01 | 山东超越数控电子股份有限公司 | A kind of construction method and system of polynary isomery storage service management platform |
CN113835616A (en) * | 2020-06-23 | 2021-12-24 | 华为技术有限公司 | Applied data management method, system and computer device |
WO2021258881A1 (en) * | 2020-06-23 | 2021-12-30 | 华为技术有限公司 | Data management method and system for application, and computer device |
CN113835616B (en) * | 2020-06-23 | 2025-06-13 | 华为技术有限公司 | Application data management method, system and computer device |
CN111787008A (en) * | 2020-06-30 | 2020-10-16 | 北京指掌易科技有限公司 | Access control method, device, electronic equipment and computer readable storage medium |
CN111787008B (en) * | 2020-06-30 | 2023-01-20 | 北京指掌易科技有限公司 | Access control method, device, electronic equipment and computer readable storage medium |
CN114579560A (en) * | 2020-12-01 | 2022-06-03 | 中移(苏州)软件技术有限公司 | Data platform and application method thereof |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20170308 |