CN112235356B - Distributed PB-level CFD simulation data management system based on cluster - Google Patents
- Publication number
- CN112235356B (application CN202011007979.7A)
- Authority
- CN
- China
- Prior art keywords
- data
- file
- storage
- module
- client
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/02—Protocols based on web technology, e.g. hypertext transfer protocol [HTTP]
- H04L67/025—Protocols based on web technology, e.g. hypertext transfer protocol [HTTP] for remote control or remote monitoring of applications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F30/00—Computer-aided design [CAD]
- G06F30/20—Design optimisation, verification or simulation
- G06F30/28—Design optimisation, verification or simulation using fluid dynamics, e.g. using Navier-Stokes equations or computational fluid dynamics [CFD]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/06—Protocols specially adapted for file transfer, e.g. file transfer protocol [FTP]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/08—Protocols specially adapted for terminal emulation, e.g. Telnet
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/10—Protocols in which an application is distributed across nodes in the network
- H04L67/1097—Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Mathematical Analysis (AREA)
- Fluid Mechanics (AREA)
- Computing Systems (AREA)
- Mathematical Optimization (AREA)
- Mathematical Physics (AREA)
- Pure & Applied Mathematics (AREA)
- Computer Hardware Design (AREA)
- Evolutionary Computation (AREA)
- Geometry (AREA)
- General Engineering & Computer Science (AREA)
- Algebra (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a cluster-based distributed PB-level CFD simulation data management system and relates to the technical field of simulation data management. The system hardware is divided into two main parts, a client side and a server side, each connected to the Internet through a router. The client is the human-machine interface that interacts directly with the user; it sends the user's instructions to the server side and presents the results returned by the server side to the user. Because storage is distributed, the remaining data can still be used even if part of the data is lost. Data generation (calculation) and storage are both completed within the cluster, so the resources of the cluster machines are fully utilized, and large-scale data is divided into small data units so that idle disk resources can be used effectively. The system supports simultaneous data access by multiple engineers, reduces data transfer time, and makes data accessible immediately after it is calculated, giving high timeliness.
Description
Technical Field
The invention relates to the technical field of simulation data management, in particular to a distributed PB-level CFD simulation data management system based on a cluster.
Background
With the growth of computing power, and especially the use of supercomputers such as Tianhe and Sunway TaihuLight in the CFD field, the volume of CFD simulation result data has increased rapidly, and a single calculation producing TB-level data is becoming the norm. Such data volumes exceed the storage capacity of a single computer. The invention therefore provides a cluster-based distributed PB-level CFD simulation result data management system that, on the one hand, offers an efficient method of data storage and retrieval and, on the other hand, makes full use of the idle storage resources of the cluster.
In research institutes and enterprises, engineering-level CFD simulation analysis is often run on a supercomputer or a private computing cluster to speed up the analysis, and the number of available computing cores usually ranges from several hundred to several thousand. When the cluster performs a calculation, each node sends its results to one or a few fixed nodes, which merge the data and store it on their own hard disks. In this case, only the storage resources of a few nodes are used, and the storage resources of the other nodes are not fully utilized.
At present, CFD result data is often large. On the one hand, a single file is large, usually hundreds of MB and sometimes more than 1 GB; on the other hand, a single calculation produces a large number of time-series files, which can reach hundreds of thousands. Research institutions and enterprises usually run CFD simulations on dedicated private clusters, and a simulation engineer usually either copies the results away on a single removable storage device or downloads them over the local area network before analyzing them. Results are typically available for download over the local area network only for a short time, because result files can quickly be overwritten by other calculations.
A CFD simulation result is divided into two parts: a mesh (grid) part that represents the geometry, and an attribute data part that represents the physical quantities. The mesh part is in turn divided into nodes and units (elements, that is, the connectivity between nodes). Attribute data falls into three categories: scalars, vectors, and tensors. In addition, the result of a CFD simulation analysis is usually a time series, that is, one simulation result is output at every time step.
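For illustration only, the result structure described above can be sketched as a small data model. The class names, field names, and the single-tetrahedron example are assumptions made for this sketch and do not appear in the patent.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Mesh:
    """Geometry part of a result: nodes plus units (element connectivity)."""
    nodes: List[Tuple[float, float, float]]   # node coordinates
    elements: List[Tuple[int, ...]]           # node indices forming each unit


@dataclass
class TimeStepResult:
    """One output of the solver: the mesh plus attribute data for one time step."""
    step: int
    mesh: Mesh
    scalars: Dict[str, List[float]] = field(default_factory=dict)
    vectors: Dict[str, List[Tuple[float, float, float]]] = field(default_factory=dict)
    tensors: Dict[str, List[Tuple[float, ...]]] = field(default_factory=dict)


def attribute_kinds(result: TimeStepResult) -> Dict[str, str]:
    """Return a physical attribute list: attribute name -> scalar/vector/tensor."""
    kinds = {}
    for kind, table in (("scalar", result.scalars),
                        ("vector", result.vectors),
                        ("tensor", result.tensors)):
        for name in table:
            kinds[name] = kind
    return kinds


# Hypothetical example: a single tetrahedron with one scalar and one vector field.
example = TimeStepResult(
    step=0,
    mesh=Mesh(
        nodes=[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)],
        elements=[(0, 1, 2, 3)],
    ),
    scalars={"pressure": [101325.0] * 4},
    vectors={"velocity": [(1.0, 0.0, 0.0)] * 4},
)
```

A time-series result is then simply a sequence of such objects, one per time step.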
Disclosure of Invention
In order to solve the problems in the prior art, the invention provides a cluster-based distributed PB-level CFD simulation data management system with the following technical scheme:
a distributed PB-level CFD simulation data management system based on a cluster comprises a client 1, a client 2, a router 1, a router 2, a server, a central switch, a storage node 1, a storage node 2, a storage node 3 and a storage node 4;
the client 1 and the client 2 are connected with a router 1, the router 1 is connected with the router 2 through the Internet, the router 2 is connected with a server, the router 2 is connected with a central switch, and the central switch is connected with a storage node 1, a storage node 2, a storage node 3 and a storage node 4;
the client 1 and the client 2 comprise a TCP/IP service module, a data management module, a control center, a service function module and a user interaction interface;
the TCP/IP service module is connected with the data management module and the control center, the data management module is connected with the control center, the control center is connected with the service function module, and the control center and the service function module are connected with the user interaction interface.
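As a rough sketch, the hardware connections listed above can be written down as a graph and checked for reachability. The device names follow the text; the graph representation and the helper names `build_graph` and `hop_counts` are assumptions made for this illustration.

```python
from collections import deque

# Hardware links exactly as enumerated in the technical scheme.
LINKS = [
    ("client 1", "router 1"),
    ("client 2", "router 1"),
    ("router 1", "router 2"),          # this link crosses the Internet
    ("router 2", "server"),
    ("router 2", "central switch"),
    ("central switch", "storage node 1"),
    ("central switch", "storage node 2"),
    ("central switch", "storage node 3"),
    ("central switch", "storage node 4"),
]


def build_graph(links):
    """Build an undirected adjacency map from a list of device pairs."""
    graph = {}
    for a, b in links:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def hop_counts(graph, start):
    """Breadth-first search: number of links from `start` to every device."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for neighbor in graph[current]:
            if neighbor not in dist:
                dist[neighbor] = dist[current] + 1
                queue.append(neighbor)
    return dist


GRAPH = build_graph(LINKS)
HOPS = hop_counts(GRAPH, "client 1")
```

In this sketch every storage node is four links away from a client, and the server and the central switch both hang off router 2.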
Preferably, a management system is deployed on the server to manage and maintain the whole system cluster; the server responds to instructions from the client, merges the files required by an instruction, and transmits the merged file to the specified client;
the storage nodes are the final storage location of the data, that is, all data is stored in a distributed manner on the hard disks of different storage nodes.
Preferably, the management system stores result files in a distributed manner: based on the data characteristics of a CFD simulation result file, it splits the file into smaller data units and then stores them in a distributed manner.
Preferably, the result file is split in a grid-attribute mode, and a mapping relation database is established so that the split parts can be retrieved and merged again; a single CFD result file comprises grid data and physical attribute data; the grid data is split into a grid node sequence and unit topology data, which are stored on different storage nodes, and the storage paths are stored in the mapping relation database; or a physical attribute list is established, the physical attribute data is split and stored by scalar, vector and tensor, and the storage paths are stored in the mapping relation database.
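A minimal sketch of the grid-attribute split follows. It applies both the grid split and the attribute split to one result, although the text presents them as alternatives. The round-robin placement, the path layout, the record fields, and the function name `split_result` are all assumptions; the patent does not specify how parts are assigned to storage nodes.

```python
from itertools import cycle


def split_result(case_id, step, result, storage_nodes):
    """Split one result into parts and build mapping records for each part.

    `result` is a plain dict with keys "nodes", "elements" and optional
    "scalar" / "vector" / "tensor" dicts (a hypothetical in-memory layout).
    Returns (mapping, payloads): the mapping records that would go into the
    mapping relation database, and the data to be written at each path.
    """
    parts = [("mesh_nodes", result["nodes"]),
             ("mesh_elements", result["elements"])]
    for kind in ("scalar", "vector", "tensor"):
        for name, values in result.get(kind, {}).items():
            parts.append((f"{kind}:{name}", values))

    mapping, payloads = [], {}
    # Assumption: parts are placed on storage nodes in round-robin order.
    for (part, data), node in zip(parts, cycle(storage_nodes)):
        path = f"{node}/{case_id}/{step:06d}/{part.replace(':', '_')}.dat"
        payloads[path] = data
        mapping.append({"case": case_id, "step": step,
                        "part": part, "node": node, "path": path})
    return mapping, payloads


# Hypothetical two-node mesh with one scalar and one vector attribute.
sample = {
    "nodes": [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    "elements": [(0, 1)],
    "scalar": {"pressure": [101325.0, 101300.0]},
    "vector": {"velocity": [(1.0, 0.0, 0.0), (0.9, 0.1, 0.0)]},
}
mapping, payloads = split_result("caseA", 3, sample,
                                 ["node1", "node2", "node3", "node4"])
```

Each mapping record carries enough information (case, time step, part label, node, path) to locate a part again and to merge the parts back into a complete file.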
Preferably, after splitting, a single file becomes a plurality of small data files stored on different storage nodes, so the data must additionally be maintained and managed at the case level to guarantee forward lookup by case and time sequence.
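One way to picture this case-level bookkeeping is an index keyed by case and time step. The class below is a hypothetical in-memory sketch; the patent keeps this information in database tables whose layout is not reproduced in the text.

```python
from collections import defaultdict


class CaseIndex:
    """Forward lookup: case -> ordered time steps -> paths of the split parts."""

    def __init__(self):
        self._steps = defaultdict(dict)   # case -> {step: [paths]}

    def add(self, case, step, path):
        """Register one split part under its case and time step."""
        self._steps[case].setdefault(step, []).append(path)

    def steps(self, case):
        """Time steps recorded for a case, in ascending order."""
        return sorted(self._steps.get(case, {}))

    def paths(self, case, step):
        """All part paths recorded for one time step of a case."""
        return list(self._steps.get(case, {}).get(step, []))


index = CaseIndex()
index.add("caseA", 2, "node1/caseA/000002/mesh_nodes.dat")
index.add("caseA", 1, "node2/caseA/000001/mesh_nodes.dat")
index.add("caseA", 1, "node3/caseA/000001/scalar_pressure.dat")
```

Looking up a case first yields its time steps in order, and each time step then yields the parts needed to rebuild that step's result file.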
Preferably, the TCP/IP service module connects to the server side and transmits instructions through network port mapping, and downloads files sent by the server side to the local machine through the FTP protocol;
the data management module manages and parses the downloaded files;
the control center is the core of the client software; it processes and forwards user instructions, and organizes and coordinates data to carry out the functional instructions of the software;
the service function module implements the core business functions, mainly data processing and data visualization;
the user interaction interface is the software window operated directly by the user and is the front end of human-computer interaction.
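The FTP download step of the client's TCP/IP service module might look like the following sketch, which uses Python's standard `ftplib`. The function names, the host, the credentials, and the file names are placeholders; the patent does not describe the client at this level of detail.

```python
from ftplib import FTP
from pathlib import Path, PurePosixPath


def download_file(ftp, remote_name, local_dir):
    """Download one file over an already logged-in FTP connection."""
    local_dir = Path(local_dir)
    local_dir.mkdir(parents=True, exist_ok=True)
    local_path = local_dir / PurePosixPath(remote_name).name
    with open(local_path, "wb") as out:
        # RETR streams the remote file; each received block is written locally.
        ftp.retrbinary(f"RETR {remote_name}", out.write)
    return local_path


def fetch(host, user, password, remote_name, local_dir):
    """Connect, log in, download one file, and close the connection."""
    with FTP(host) as ftp:
        ftp.login(user=user, passwd=password)
        return download_file(ftp, remote_name, local_dir)


# Hypothetical usage (placeholder host, credentials and file name):
# fetch("server.example", "engineer", "secret",
#       "results/caseA_000003.dat", "downloads")
```

Keeping the transfer in `download_file` separate from connection handling lets the data management module reuse one connection for several files.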
Preferably, the server side mainly implements the storage and reading of data; its TCP/IP service connects to the client and to the storage nodes through network ports for instruction transmission and file transmission; the file storage module decomposes the result file, stores the decomposed files on the storage nodes, and stores the storage paths in the database;
the file reading module collects the files stored on different nodes as required by the instruction, merges them into a complete file, and transmits the complete file to the client;
the database management module maintains the database in a unified way; it uses a MySQL database and encapsulates the basic add, delete, modify and query operations on the database.
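The encapsulated add, delete, modify and query operations can be sketched as a thin wrapper class. The patent names MySQL; this sketch substitutes Python's built-in `sqlite3` so that it runs without a database server, and the table layout, column names, and method names are assumptions.

```python
import sqlite3


class MappingDatabase:
    """Wraps add / delete / modify / query on a hypothetical mapping table.

    Stand-in only: sqlite3 is used here in place of the MySQL database
    named in the text.
    """

    def __init__(self, connection):
        self.conn = connection
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS mapping ("
            "id INTEGER PRIMARY KEY, case_id TEXT NOT NULL, "
            "step INTEGER NOT NULL, part TEXT NOT NULL, "
            "node TEXT NOT NULL, path TEXT NOT NULL)"
        )

    def add(self, case_id, step, part, node, path):
        """Insert one storage-path record and return its id."""
        cur = self.conn.execute(
            "INSERT INTO mapping (case_id, step, part, node, path) "
            "VALUES (?, ?, ?, ?, ?)",
            (case_id, step, part, node, path),
        )
        self.conn.commit()
        return cur.lastrowid

    def delete(self, record_id):
        """Remove one record by id."""
        self.conn.execute("DELETE FROM mapping WHERE id = ?", (record_id,))
        self.conn.commit()

    def update_location(self, record_id, node, path):
        """Point an existing record at a new node and path."""
        self.conn.execute(
            "UPDATE mapping SET node = ?, path = ? WHERE id = ?",
            (node, path, record_id),
        )
        self.conn.commit()

    def query(self, case_id, step):
        """Return (part, node, path) rows for one time step of a case."""
        return self.conn.execute(
            "SELECT part, node, path FROM mapping "
            "WHERE case_id = ? AND step = ? ORDER BY id",
            (case_id, step),
        ).fetchall()
```

The other server modules would call only these four methods, so the SQL stays in one place.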
Preferably, the storage node is responsible for managing the upload and download of its local files: its TCP/IP service connects to the server-side software through a network port for instruction transmission and file transmission; file storage saves the files transmitted from the server side locally; resource maintenance manages the local storage resources, adding and deleting files and querying information such as the number of files; and file uploading uploads locally stored files to the server side.
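The storage-node duties listed above (local storage, resource maintenance, and upload) can be sketched with a small class that works on a local directory. The class and method names are hypothetical, and the network transport between server and node is left out.

```python
import tempfile
from pathlib import Path


class StorageNode:
    """Local file management on one storage node (network layer omitted)."""

    def __init__(self, root):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def store(self, rel_path, data):
        """File storage: save bytes received from the server side."""
        target = self.root / rel_path
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(data)
        return target

    def delete(self, rel_path):
        """Resource maintenance: remove a locally stored file."""
        (self.root / rel_path).unlink()

    def file_count(self):
        """Resource maintenance: number of files currently stored."""
        return sum(1 for p in self.root.rglob("*") if p.is_file())

    def upload(self, rel_path):
        """File uploading: read a stored file so it can be sent to the server."""
        return (self.root / rel_path).read_bytes()


# Usage on a temporary directory standing in for the node's hard disk.
with tempfile.TemporaryDirectory() as root:
    node = StorageNode(root)
    node.store("caseA/000003/mesh_nodes.dat", b"\x00\x01")
    node.store("caseA/000003/scalar_pressure.dat", b"\x02")
    count_before = node.file_count()
    uploaded = node.upload("caseA/000003/mesh_nodes.dat")
    node.delete("caseA/000003/scalar_pressure.dat")
    count_after = node.file_count()
```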
The invention has the following beneficial effects:
The system hardware of the invention is divided into two main parts, a client side and a server side, each connected to the Internet through a router.
The client is the human-machine interface that interacts directly with the user; it sends the user's instructions to the server side and presents the results returned by the server side to the user.
The server side is the core of the whole system. In terms of hardware, it consists of a router, a server, and a plurality of storage nodes connected through a switch. The management system is deployed on the server to manage and maintain the whole system cluster, and the storage nodes are the final storage location of the data, that is, all data is stored in a distributed manner on the hard disks of different storage nodes. Functionally, the server responds to instructions from the client, merges the files required by an instruction, and transmits the merged file to the specified client. With distributed storage the data is safer: even if part of the data is lost, the remaining data can still be used. Data generation (calculation) and storage are both completed within the cluster, so the resources of the cluster machines are fully utilized, and large-scale data is divided into small data units so that idle disk resources can be used effectively. The system supports simultaneous data access by multiple engineers, reduces data transfer time, and makes data accessible immediately after it is calculated, giving high timeliness.
Drawings
FIG. 1 is a block diagram of a cluster-based distributed PB-level CFD simulation data management system;
FIG. 2 is a diagram of a client architecture;
FIG. 3 is a diagram of a server side architecture;
FIG. 4 is a storage node side architecture diagram;
FIG. 5 is a file storage flow diagram;
FIG. 6 is a flowchart of file query.
Detailed Description
The present invention will be described in detail with reference to specific examples.
The first embodiment is as follows:
As shown in FIG. 1 to FIG. 6, the present invention provides a cluster-based distributed PB-level CFD simulation data management system, which specifically includes:
a distributed PB-level CFD simulation data management system based on a cluster comprises a client 1, a client 2, a router 1, a router 2, a server, a central switch, a storage node 1, a storage node 2, a storage node 3 and a storage node 4;
the client 1 and the client 2 are connected with a router 1, the router 1 is connected with the router 2 through the Internet, the router 2 is connected with a server, the router 2 is connected with a central switch, and the central switch is connected with a storage node 1, a storage node 2, a storage node 3 and a storage node 4;
the client 1 and the client 2 comprise a TCP/IP service module, a data management module, a control center, a service function module and a user interaction interface;
the TCP/IP service module is connected with the data management module and the control center, the data management module is connected with the control center, the control center is connected with the service function module, and the control center and the service function module are connected with the user interaction interface.
A management system is deployed on the server to manage and maintain the whole system cluster; the server responds to instructions from the client, merges the files required by an instruction, and transmits the merged file to the specified client;
the storage nodes are the final storage location of the data, that is, all data is stored in a distributed manner on the hard disks of different storage nodes.
The management system stores result files in a distributed manner: based on the data characteristics of a CFD simulation result file, it splits the file into smaller data units and then stores them in a distributed manner.
The result file is split in a grid-attribute mode, and a mapping relation database is established at the same time so that the split parts can be retrieved and merged again; the splitting scheme is shown in Table 1 below. As described above, a single CFD result file contains grid data and physical attribute data; the grid data is split into a grid node sequence and unit topology data, which are stored on different storage nodes, and the storage paths are stored in the mapping relation database; or a physical attribute list is established, the physical attribute data is split and stored by scalar, vector and tensor, and the storage paths are stored in the mapping relation database.
TABLE 1
After splitting, a single file becomes a plurality of small data files stored on different storage nodes, so the data must additionally be maintained and managed at the case level in the form of Table 2 to guarantee forward lookup by case and time sequence.
TABLE 2
The TCP/IP service module connects to the server side and transmits instructions through network port mapping, and downloads files sent by the server side to the local machine through the FTP protocol;
the data management module manages and parses the downloaded files;
the control center is the core of the client software; it processes and forwards user instructions, and organizes and coordinates data to carry out the functional instructions of the software;
the service function module implements the core business functions, mainly data processing and data visualization;
the user interaction interface is the software window operated directly by the user and is the front end of human-computer interaction.
The server mainly implements the storage and reading of data; its TCP/IP service connects to the client and to the storage nodes through network ports for instruction transmission and file transmission; the file storage module decomposes the result file, stores the decomposed files on the storage nodes, and stores the storage paths in the database;
the file reading module collects the files stored on different nodes as required by the instruction, merges them into a complete file, and transmits the complete file to the client;
the database management module maintains the database in a unified way; it uses a MySQL database and encapsulates the basic add, delete, modify and query operations on the database.
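The collect-and-merge step of the file reading module can be sketched as follows. The record fields (`part`, `node`, `path`), the part labels, and the `fetch` callable are hypothetical; in the described system the records would come from the mapping relation database and the data from the storage nodes. Requiring both grid parts to be present is also an assumption of this sketch.

```python
def merge_result(records, fetch):
    """Collect split parts and merge them back into one result.

    `records` is an iterable of dicts with "part", "node" and "path" keys;
    `fetch(node, path)` returns the data stored at that location.
    """
    result = {"nodes": None, "elements": None,
              "scalar": {}, "vector": {}, "tensor": {}}
    for record in records:
        data = fetch(record["node"], record["path"])
        part = record["part"]
        if part == "mesh_nodes":
            result["nodes"] = data
        elif part == "mesh_elements":
            result["elements"] = data
        else:
            # Attribute parts are labelled "<kind>:<name>", e.g. "scalar:pressure".
            kind, _, name = part.partition(":")
            if kind not in ("scalar", "vector", "tensor") or not name:
                raise ValueError(f"unknown part label: {part!r}")
            result[kind][name] = data
    missing = [key for key in ("nodes", "elements") if result[key] is None]
    if missing:
        raise ValueError(f"incomplete result, missing: {missing}")
    return result


# In-memory stand-in for data held on two storage nodes.
STORE = {
    ("node1", "caseA/000003/mesh_nodes.dat"): [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    ("node2", "caseA/000003/mesh_elements.dat"): [(0, 1)],
    ("node1", "caseA/000003/scalar_pressure.dat"): [101325.0, 101300.0],
}
RECORDS = [
    {"part": "mesh_nodes", "node": "node1", "path": "caseA/000003/mesh_nodes.dat"},
    {"part": "mesh_elements", "node": "node2", "path": "caseA/000003/mesh_elements.dat"},
    {"part": "scalar:pressure", "node": "node1", "path": "caseA/000003/scalar_pressure.dat"},
]
merged = merge_result(RECORDS, lambda node, path: STORE[(node, path)])
```

Passing the transport in as `fetch` keeps the merge logic independent of how the server actually pulls files from the storage nodes.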
The storage node is responsible for managing the upload and download of its local files: its TCP/IP service connects to the server-side software through a network port for instruction transmission and file transmission; file storage saves the files transmitted from the server side locally; resource maintenance manages the local storage resources, adding and deleting files and querying information such as the number of files; and file uploading uploads locally stored files to the server side.
The above is only a preferred embodiment of the cluster-based distributed PB-level CFD simulation data management system. The protection scope of the system is not limited to the above embodiment, and all technical solutions that fall under this idea belong to the protection scope of the invention. It should be noted that modifications and variations that those skilled in the art may make without departing from the gist of the invention are also intended to be within the protection scope of the invention.
Claims (6)
1. A distributed PB-level CFD simulation data management system based on clusters is characterized in that: the management system comprises a client 1, a client 2, a router 1, a router 2, a server, a central switch, a storage node 1, a storage node 2, a storage node 3 and a storage node 4;
the client 1 and the client 2 are connected with a router 1, the router 1 is connected with the router 2 through the Internet, the router 2 is connected with a server, the router 2 is connected with a central switch, and the central switch is connected with a storage node 1, a storage node 2, a storage node 3 and a storage node 4;
the client 1 and the client 2 comprise a TCP/IP service module, a data management module, a control center, a service function module and a user interaction interface;
the TCP/IP service module is connected with a data management module and a control center, the data management module is connected with the control center, the control center is connected with a service function module, and the control center and the service function module are connected with a user interaction interface;
the management system stores the result file in a distributed storage mode, splits the result file into smaller data units based on the data characteristics of the CFD simulation result file, and then performs distributed storage;
splitting the result file in a grid-attribute mode, and simultaneously establishing a mapping relation database to ensure that the split result is retrieved and merged again; for a single CFD result file, the single CFD result file comprises grid data and physical attribute data, the grid data is divided into grid node sequences and unit topology data, the grid node sequences and the unit topology data are respectively stored in different storage nodes, and storage paths are stored in a mapping relation database; or establishing a physical attribute list, splitting and storing the physical attribute data according to a scalar, a vector and a tensor, and storing the storage path into a mapping relation database.
2. The distributed cluster-based PB-level CFD simulation data management system of claim 1, wherein: the server is provided with a management system to realize management and maintenance of the whole system cluster, and the server needs to complete instruction response to the client and transmit files required by the instruction to the specified client after combining;
the storage node is the final storage end of the data, that is, all the data will be stored in the hard disks of different storage nodes in a distributed manner.
3. The distributed cluster-based PB-level CFD simulation data management system of claim 1, wherein: after splitting, a single file becomes a plurality of small data files stored on different storage nodes, and the data is additionally maintained and managed at the case level to guarantee forward lookup by case and time sequence.
4. The distributed cluster-based PB-level CFD simulation data management system of claim 1, wherein: the TCP/IP service module realizes the connection and instruction transmission with the server end through the mapping of the network port and realizes the downloading of the file transmitted by the server end to the local through the FTP protocol;
the data management module is used for managing the downloaded file and analyzing and managing the file;
the control center is the core of the whole client software, realizes the processing and forwarding of user instructions, and organizes and coordinates data to realize the functional instructions of the software;
the business function module realizes the core business function and mainly comprises the functions of data processing and data visualization;
the user interaction interface is a software window directly operated by a user and is the foremost end of human-computer interaction.
5. The distributed cluster-based PB-level CFD simulation data management system of claim 1, wherein: the server side realizes the storage and reading of data, and comprises a TCP/IP service module, a file storage module, a file reading module and a database management module, wherein the TCP/IP service module is connected with the client side and the storage node side through a network port to transmit instructions and transmit files; the file storage module is used for decomposing the result file, storing the decomposed file into a storage node and storing a storage path into a database;
the file reading module collects and combines files stored in different nodes into a complete file according to the requirement of the instruction, and transmits the complete file to the client;
the database management module realizes the unified maintenance of the database, adopts the MySQL database, and encapsulates the basic operations of adding, deleting, modifying and checking the database.
6. The distributed cluster-based PB-level CFD simulation data management system of claim 1, wherein: the storage node is responsible for uploading and downloading management of local files of the storage node, and comprises a TCP/IP service module, a file storage module, a resource maintenance module and a file uploading module, wherein the TCP/IP service module is connected with server-side software through a network port to realize instruction transmission and file transmission, the file storage module realizes the local storage of the files transmitted from the server side, the resource maintenance module realizes the management of local storage resources, performs addition and deletion operations on files and queries information on the number of files, and the file uploading module realizes the uploading of the files stored locally to the server side.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011007979.7A CN112235356B (en) | 2020-09-23 | 2020-09-23 | Distributed PB-level CFD simulation data management system based on cluster |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011007979.7A CN112235356B (en) | 2020-09-23 | 2020-09-23 | Distributed PB-level CFD simulation data management system based on cluster |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112235356A (en) | 2021-01-15 |
CN112235356B (en) | 2021-09-07 |
Family
ID=74108611
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011007979.7A (Active) CN112235356B (en) | 2020-09-23 | 2020-09-23 | Distributed PB-level CFD simulation data management system based on cluster |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112235356B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115481539B (en) * | 2022-09-29 | 2023-06-06 | Chengdu Anshi Yatai Technology Co., Ltd. | Simulation result data rapid analysis and storage method |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102841854A (en) * | 2011-05-20 | 2012-12-26 | International Business Machines Corporation | Method and system for executing data reading based on dynamic hierarchical memory cache (hmc) awareness |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106446099A (en) * | 2016-09-13 | 2017-02-22 | National Supercomputing Center in Shenzhen (Shenzhen Cloud Computing Center) | Distributed cloud storage method and system and uploading and downloading method thereof |
CN107329982A (en) * | 2017-06-01 | 2017-11-07 | South China University of Technology | Big data parallel computing method and system based on distributed columnar storage |
US11470146B2 (en) * | 2018-08-25 | 2022-10-11 | Panzura, Llc | Managing a cloud-based distributed computing environment using a distributed database |
US11178246B2 (en) * | 2018-08-25 | 2021-11-16 | Panzura, Llc | Managing cloud-based storage using a time-series database |
CN110378037B (en) * | 2019-07-23 | 2022-08-19 | Suzhou Inspur Intelligent Technology Co., Ltd. | CFD simulation data storage method and device based on Ceph and server |
Also Published As
Publication number | Publication date |
---|---|
CN112235356A (en) | 2021-01-15 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||