Disclosure of Invention
The invention aims to solve the technical problem of providing a geological mapping information extraction method and system based on multi-source data fusion, which enable the data fusion process to be more flexible and intelligent, and can be dynamically adjusted according to the characteristics and real-time states of different data sources, so that the accuracy and efficiency of data fusion are improved.
In order to solve the technical problems, the technical scheme of the invention is as follows:
In a first aspect, a method for extracting geological mapping information based on multi-source data fusion is provided, the method comprising:
Step 1, collecting multi-source geological data, including satellite remote sensing data, geological exploration data, geophysical data, geochemical data and ground actual measurement data;
Step 2, preprocessing the multi-source geological data to obtain preprocessed multi-source geological data;
Step 3, calculating a dynamic adjustment factor for each data source based on the amount of information provided by the data source, its data update frequency and the accuracy of its historical data;
Step 4, fusing the preprocessed multi-source geological data by using the dynamic adjustment factors to obtain fused geological data;
Step 5, extracting target geological mapping information from the fused geological data by using a geological information extraction algorithm;
Step 6, screening the extracted target geological mapping information to form a final geological mapping information extraction result.
Further, collecting the multi-source geological data comprises:
randomly generating a set of data source combinations as an initial population, each combination representing one solution, that is, one selected set of data sources;
selecting individuals from the current population as parents according to a fitness function, and combining the selected parents to generate the child individuals of the next generation;
randomly mutating the newly generated child individuals, replacing part of the old population with them to form a new-generation population, and repeating the selection, crossover, mutation and population-generation process until a preset number of iterations is reached, to obtain an optimized data source combination;
and collecting the corresponding multi-source geological data according to the optimized data source combination.
Further, the fitness function is calculated as follows:
;
Wherein the terms of the formula denote, respectively: the number of valid, non-missing data entries in a given data source; the total number of data entries in that data source; three weight parameters; the total number of data sources; two data source indices used to traverse and identify the different data sources; the complementarity score; the redundancy score between a pair of data sources; and the correlation score between a given data source and the target geological mapping information, which is calculated as follows:
;
Wherein the terms denote, respectively: the value of a given data entry in a given data source; the average of all data in that data source; the value of the corresponding data entry in the target geological mapping information; the average of all data in the target geological mapping information; the total number of data entries; and the data entry index used to traverse the specific data points in the data source.
Further, the redundancy score between a pair of data sources is calculated as follows:
;
Wherein the term denotes the value of a given data entry in a given data source. The complementarity score is calculated as follows:
;
Wherein the terms denote, respectively: the amount of information contained in common by the two data sources; and the total amount of information after the two data sources are merged.
Further, the dynamic adjustment factor of each data source is calculated as follows:
;
Wherein the terms denote, respectively: three weight parameters; the information amount of a given data source; the data update frequency of that data source; a historical data point of that data source; the corresponding true value; the total number of data points, i.e. the number of data points considered when calculating the historical data accuracy of that data source; the total number of data sources, i.e. the number of different data sources considered in the dynamic adjustment process; the index used to traverse all data sources when calculating a total; and the index used to traverse all data points of a particular data source when calculating its accuracy.
Further, fusing the preprocessed multi-source geological data by using the dynamic adjustment factors to obtain the fused geological data comprises:
multiplying the data in each data source by the corresponding dynamic adjustment factor to obtain weighted data source data;
determining the units over which averages are calculated and, for each such unit, traversing the data sources and searching for data points that match the current unit, so as to extract the corresponding data values from all weighted data source data;
for each unit, averaging all the extracted weighted data values;
and, by extracting the corresponding data values and averaging them, calculating for each unit a comprehensive geological data value that fuses the information of multiple data sources, the comprehensive geological data values constituting the fused geological data.
Further, extracting the target geological mapping information from the fused geological data by using a geological information extraction algorithm comprises:
performing statistical analysis on the fused geological data to determine the numerical range and distribution characteristics of the data;
setting a threshold according to how the target geological feature manifests itself in the data;
traversing the fused geological data and judging whether the value of each data point exceeds the set threshold, marking the data point as a target geological feature point if it does and as a non-target point otherwise;
creating a new data table for storing the extracted target geological mapping information and, for each data point marked as a target geological feature point, adding its location information and attribute value to the data table to obtain the target geological mapping information.
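The threshold-based extraction described above can be illustrated with a minimal sketch. The record layout (`location`, `value`), the function name `extract_targets` and the threshold rule (mean plus `k` standard deviations) are assumptions made for illustration only; the text does not specify how the threshold is derived.

```python
from statistics import mean, pstdev

def extract_targets(fused, k=1.0):
    """fused: list of dicts with 'location' and 'value' keys (hypothetical layout)."""
    values = [p["value"] for p in fused]
    # Statistical analysis: numerical range and spread of the fused values.
    stats = {"min": min(values), "max": max(values),
             "mean": mean(values), "std": pstdev(values)}
    # Assumed threshold rule: mean + k * standard deviation.
    threshold = stats["mean"] + k * stats["std"]
    # Traverse the data, keep only points above the threshold in a new table.
    table = [{"location": p["location"], "value": p["value"]}
             for p in fused if p["value"] > threshold]
    return stats, threshold, table

fused = [
    {"location": (0, 0), "value": 1.0},
    {"location": (0, 1), "value": 1.2},
    {"location": (1, 0), "value": 0.9},
    {"location": (1, 1), "value": 5.0},
]
stats, threshold, table = extract_targets(fused)
```

In this toy input only the point at `(1, 1)` lies above the threshold, so the resulting table holds a single target geological feature point.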
In a second aspect, a geological mapping information extraction system based on multi-source data fusion is provided, the system comprising:
a collecting module, configured to collect multi-source geological data, including satellite remote sensing data, geological exploration data, geophysical data, geochemical data and ground actual measurement data;
a preprocessing module, configured to preprocess the multi-source geological data to obtain preprocessed multi-source geological data;
a calculation module, configured to calculate a dynamic adjustment factor for each data source based on the amount of information provided by the data source, its data update frequency and the accuracy of its historical data;
a fusion module, configured to fuse the preprocessed multi-source geological data by using the dynamic adjustment factors to obtain fused geological data;
an extraction module, configured to extract target geological mapping information from the fused geological data by using a geological information extraction algorithm;
and a screening module, configured to screen the extracted target geological mapping information to form a final geological mapping information extraction result.
In a third aspect, a computing device is provided, comprising:
one or more processors;
and a storage device for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the method.
In a fourth aspect, a computer-readable storage medium is provided, in which a program is stored which, when executed by a processor, implements the method.
The above scheme of the invention at least comprises the following beneficial effects.
According to the invention, by collecting and integrating multi-source geological data such as satellite remote sensing data, geological exploration data, geophysical data, geochemical data, ground measured data and the like, comprehensive and multi-angle capturing of geological information is realized, so that the integrity and accuracy of geological mapping information are improved.
The invention innovatively introduces a dynamic adjustment factor, and the mechanism can dynamically adjust the weight of data fusion according to the information quantity provided by each data source, the frequency of data update and the accuracy of historical data. The method not only enables the data fusion process to be more flexible, but also can reflect the reliability and timeliness of different data sources in real time, and further improves the accuracy of the fused geological data.
By preprocessing the multi-source geological data, the method effectively eliminates the problems of format difference among the data, non-uniform coordinate system and the like, simplifies the data processing flow and improves the geological information extraction efficiency.
After the target geological mapping information is extracted, an information screening step is further carried out, and the link is helpful for removing redundant and error information, so that the finally formed geological mapping information extraction result is more accurate and reliable.
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
As shown in Fig. 1, an embodiment of the present invention proposes a geological mapping information extraction method based on multi-source data fusion, the method comprising the following steps:
Step 1, collecting multi-source geological data, including satellite remote sensing data, geological exploration data, geophysical data, geochemical data and ground actual measurement data;
Step 2, preprocessing the multi-source geological data to obtain preprocessed multi-source geological data;
Step 3, calculating a dynamic adjustment factor for each data source based on the amount of information provided by the data source, its data update frequency and the accuracy of its historical data;
Step 4, fusing the preprocessed multi-source geological data by using the dynamic adjustment factors to obtain fused geological data;
Step 5, extracting target geological mapping information from the fused geological data by using a geological information extraction algorithm;
Step 6, screening the extracted target geological mapping information to form a final geological mapping information extraction result.
In the embodiment of the invention, the geological condition can be reflected more comprehensively and at multiple angles by collecting various data sources including satellite remote sensing data, geological exploration data, geophysical data, geochemical data and ground actual measurement data. The preprocessing step can remove noise, correct errors and unify data formats and coordinate systems, so that the quality and consistency of data are ensured. By considering the information quantity, the updating frequency and the historical accuracy of each data source, a dynamic adjustment factor is allocated to each data source, so that the contribution of different data sources can be weighed more accurately during data fusion, and the accuracy of fusion data is improved. The dynamic adjustment factors are utilized to conduct multi-source data fusion, and information from different data sources can be effectively integrated. The target geological mapping information can be rapidly and accurately extracted from the fused geological data through a geological information extraction algorithm, and the efficiency and accuracy of information acquisition are improved. Through screening the extracted geological mapping information, redundant and error data can be removed, and the finally extracted geological mapping information is ensured to be more accurate and reliable.
In a preferred embodiment of the present invention, collecting the multi-source geological data in step 1 may include:
Step 11, randomly generating a set of data source combinations as an initial population, each combination representing one solution, that is, one selected set of data sources;
Step 12, constructing a fitness function for evaluating the quality of each data source combination, and selecting individuals from the current population as parents according to the fitness function to generate the next generation;
Step 13, combining the selected parent individuals to generate new child individuals, randomly mutating the newly generated child individuals, replacing part of the old population with them to form a new-generation population, and repeating the selection, crossover, mutation and population-generation process until a preset number of iterations is reached, to obtain an optimized data source combination;
Step 14, collecting the corresponding multi-source geological data according to the optimized data source combination.
In the embodiment of the invention, step 11 defines the sources and types of the data, which ensures wide coverage and diversity and provides a comprehensive view for subsequent analysis; generating the initial population at random increases the diversity of the data source combinations and helps the subsequent optimization explore more candidate solutions. In step 12, the fitness function provides a quantitative standard for evaluating the quality of a data source combination, and selecting parents by fitness ensures that the features of high-quality combinations are passed on to the next generation, improving the quality of the population as a whole. In step 13, crossover and mutation generate new child individuals and so explore more potential data source combinations; this evolutionary mechanism lets combinations with higher fitness gradually emerge over the iterations, so that a combination better suited to actual needs is found. In step 14, data are collected according to the optimized combination, which improves the efficiency and accuracy of data collection and ensures that the collected data are of higher value for the subsequent extraction of geological mapping information. It also reduces unnecessary data collection work, saving resources and time.
In a specific application, the procedure is as follows:
Step 11, listing the data sources:
Satellite remote sensing data, which provide information on the earth's surface such as topography, landforms and vegetation cover.
Geological exploration data, including rock, soil and mineral data obtained by field exploration such as drilling and trial pitting.
Geophysical data, i.e. data on subsurface structure obtained by physical methods such as gravity, magnetic and electrical surveys.
Geochemical data, obtained by analysing the distribution of chemical elements in soil and rock and used for mineral resource evaluation and the like.
Ground actual measurement data, including direct ground observations such as GPS measurement and leveling.
Determining the data type:
For satellite remote sensing data, the data type may be multispectral images, hyperspectral images, radar images, and the like. The geological survey data may include core sample analysis, soil analysis reports, and the like. The geophysical data may be gravity anomaly maps, magnetic anomaly maps, or the like. Geochemical data relates to elemental distribution maps, abnormal region identification, and the like. The ground measured data provides accurate geographical location and terrain elevation information.
Constructing a data source combination space:
A list or matrix is created representing all possible combinations of data sources. For example, binary encoding may be used, where a "1" indicates that the data source is selected and a "0" indicates that it is not selected. For each data source, consider the case where it is selected or not selected, thereby generating all possible combinations.
Randomly generating an initial population:
The population size, i.e. the number of random data source combinations to be generated, is set. A random number generator is then used to select that many combinations from the data source combination space constructed above. Depending on the design requirements of the algorithm, each combination (individual) is either required to be unique or a degree of repetition is accepted. Each randomly generated data source combination represents an initial solution, namely one possible data collection scheme.
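The binary encoding and random initialization described above can be sketched as follows. The source names in `SOURCES`, the function names and the rule that an all-zero combination is rejected are illustrative assumptions, not details given in the text.

```python
import random

# Hypothetical labels for the five data source categories listed in step 11.
SOURCES = ["remote_sensing", "exploration", "geophysical",
           "geochemical", "ground_survey"]

def random_population(size, n_sources, rng, unique=True):
    """Each individual is a tuple of 0/1 flags, one flag per data source."""
    max_unique = 2 ** n_sources - 1  # the all-zero combination is excluded
    if unique and size > max_unique:
        raise ValueError("population size exceeds the number of distinct combinations")
    population, seen = [], set()
    while len(population) < size:
        individual = tuple(rng.randint(0, 1) for _ in range(n_sources))
        if not any(individual):
            continue  # assumption: selecting no source is not a valid scheme
        if unique and individual in seen:
            continue  # enforce uniqueness only when requested
        seen.add(individual)
        population.append(individual)
    return population

def decode(individual):
    """Translate a 0/1 tuple back into the names of the selected sources."""
    return [name for name, bit in zip(SOURCES, individual) if bit]

rng = random.Random(42)
population = random_population(8, len(SOURCES), rng)
```

Passing `unique=False` corresponds to the variant in which a degree of repetition among individuals is accepted.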
Step 12, designing a fitness function according to the geological mapping requirements so that the quality of each data source combination can be evaluated quantitatively; applying the fitness function to each individual in the initial population to obtain its fitness value; and selecting well-performing individuals as parents according to their fitness values. Selection is usually implemented with a standard genetic algorithm selection mechanism such as roulette-wheel selection or tournament selection.
Step 13, pairing the selected parent individuals and exchanging part of their genes, i.e. certain elements of the data source combination, through a crossover operation (such as single-point or multi-point crossover) to generate new child individuals. The newly generated child individuals are then randomly mutated, for example by flipping the selection state of one data source, to increase the diversity of the population. Some individuals in the old population are replaced by the new child individuals to form the new-generation population; this usually follows a replacement strategy such as preserving the best individuals or random replacement. Selection, crossover, mutation and population generation are repeated until the preset number of iterations is reached or another stopping condition is met.
Step 14, after the iterative optimization ends, selecting the individual with the highest fitness in the final population as the optimal data source combination, and carrying out the actual multi-source geological data collection according to it. This involves coordinating with the different data providers, converting and unifying data formats, and preliminary verification and cleaning of the data.
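Steps 12 to 14 can be sketched as a small genetic algorithm. Everything specific here is an assumption for illustration: the `QUALITY` values, the per-source cost and the toy `fitness` function stand in for the fitness function of the invention, and tournament selection, single-point crossover and best-individual preservation are just one choice among the mechanisms the text mentions.

```python
import random

# Hypothetical per-source usefulness values; not taken from the source text.
QUALITY = [0.9, 0.1, 0.8, 0.05, 0.7]
COST_PER_SOURCE = 0.15  # assumed penalty for each selected source

def fitness(individual):
    """Toy fitness: usefulness of the selected sources minus a per-source cost."""
    return sum(q - COST_PER_SOURCE for q, bit in zip(QUALITY, individual) if bit)

def tournament(population, rng, k=3):
    """Tournament selection: the best of k randomly drawn individuals."""
    return max(rng.sample(population, k), key=fitness)

def crossover(a, b, rng):
    """Single-point crossover of two parents."""
    point = rng.randint(1, len(a) - 1)
    return a[:point] + b[point:]

def mutate(individual, rng, rate=0.1):
    """Flip the selection state of each source with probability `rate`."""
    return tuple(1 - bit if rng.random() < rate else bit for bit in individual)

def evolve(rng, pop_size=12, generations=30):
    n = len(QUALITY)
    population = [tuple(rng.randint(0, 1) for _ in range(n))
                  for _ in range(pop_size)]
    history = []  # best fitness seen at the start of each generation
    for _ in range(generations):
        elite = max(population, key=fitness)
        history.append(fitness(elite))
        children = [elite]  # the best individual always survives
        while len(children) < pop_size:
            child = crossover(tournament(population, rng),
                              tournament(population, rng), rng)
            children.append(mutate(child, rng))
        population = children
    best = max(population, key=fitness)
    history.append(fitness(best))
    return best, history

best, history = evolve(random.Random(0))
```

Because the best individual is carried into every new generation, the best fitness recorded in `history` never decreases from one generation to the next.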
In a preferred embodiment of the invention, the fitness function is calculated as follows:
;
Wherein the terms of the formula denote, respectively: the number of valid, non-missing data entries in a given data source; the total number of data entries in that data source; three weight parameters; the total number of data sources; two data source indices used to traverse and identify the different data sources; the complementarity score; the redundancy score between a pair of data sources; and the correlation score between a given data source and the target geological mapping information, which is calculated as follows:
;
Wherein the terms denote, respectively: the value of a given data entry in a given data source; the average of all data in that data source; the value of the corresponding data entry in the target geological mapping information; the average of all data in the target geological mapping information; the total number of data entries; and the data entry index used to traverse the specific data points in the data source.
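The terms listed for the correlation score (entry values, their averages, the number of entries and an entry index) are the ingredients of a Pearson correlation coefficient. The sketch below assumes that form; the formula itself is not legible in this text, so this is an illustration rather than the formula of the invention. The function name and the handling of constant series are likewise assumptions.

```python
from math import sqrt

def correlation_score(source_values, target_values):
    """Pearson correlation between one data source and the target information."""
    n = len(source_values)
    if n != len(target_values) or n < 2:
        raise ValueError("need two equally long series with at least two entries")
    mean_x = sum(source_values) / n
    mean_y = sum(target_values) / n
    # Sum of products of deviations from the two averages.
    cov = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(source_values, target_values))
    var_x = sum((x - mean_x) ** 2 for x in source_values)
    var_y = sum((y - mean_y) ** 2 for y in target_values)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: a constant series is treated as uncorrelated
    return cov / sqrt(var_x * var_y)

same_trend = correlation_score([1, 2, 3, 4], [2, 4, 6, 8])
opposite_trend = correlation_score([1, 2, 3, 4], [8, 6, 4, 2])
```

A score near 1 indicates that the data source rises and falls with the target information, and a score near -1 that it moves in the opposite direction.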
In an embodiment of the present invention, by considering the ratio of the number of valid, non-missing data entries to the total number of data entries in each data source, the fitness function preferentially selects data sources of higher data quality with fewer missing values, thereby improving the quality of the overall data set. By including the complementarity score between data sources, the function favours data sources that complement each other in content, so that the geological mapping information is covered more fully and information blind spots are reduced. By taking into account the redundancy score between data sources, the function avoids selecting data sources that are highly similar or repetitive, thereby reducing data redundancy and improving the efficiency of data processing and analysis. By including the correlation score between each data source and the target geological mapping information, the function ensures that the selected data sources are closely related to the target task, thereby improving the accuracy and reliability of geological mapping.
In a preferred embodiment of the invention, the redundancy score between a pair of data sources is calculated as follows:
;
Wherein the term denotes the value of a given data entry in a given data source. The complementarity score is calculated as follows:
;
Wherein the terms denote, respectively: the amount of information contained in common by the two data sources; and the total amount of information after the two data sources are merged.
In an embodiment of the invention, the redundancy score quantifies the redundancy between two data sources by comparing their values for the same data entries, using the ratio of the minimum value to the maximum value. This helps identify and avoid highly redundant combinations of data sources during data source selection, thereby improving the efficiency of data integration and processing. The complementarity score combines the redundancy score with the degree of information overlap between the data sources. It takes into account not only how the data sources complement each other at the numerical level (represented by the redundancy score) but also how they complement each other in information content, i.e. the ratio of the amount of information they contain in common to the total amount of information after merging. This favours combinations of data sources that complement each other in content and provide more comprehensive information coverage. Considering the redundancy score and the complementarity score together allows a better data source combination strategy to be formulated: the diversity and comprehensiveness of the information content are maximized while the necessary information overlap among the data sources is maintained, which improves the accuracy and completeness of the geological mapping information. Because redundant data sources are identified and removed, unnecessary data processing and analysis are reduced, computing resources are saved and overall data processing efficiency is improved. The optimized data source combination also helps to simplify data cleaning and integration.
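The two ingredients named above, a minimum-to-maximum ratio over matching entries and the ratio of shared to merged information, can be sketched as follows. The exact formulas are not legible in this text, so the averaging over entries, the restriction to non-negative values and the modelling of "information" as a set of covered grid cells are all assumptions, as are the function names. How the two quantities are combined into the complementarity score is not reproduced here.

```python
def redundancy_score(a, b):
    """Mean min/max ratio over paired entries (assumes non-negative values)."""
    if len(a) != len(b) or not a:
        raise ValueError("need two equally long, non-empty series")
    ratios = []
    for x, y in zip(a, b):
        hi, lo = max(x, y), min(x, y)
        # Two zero entries are treated as identical, i.e. fully redundant.
        ratios.append(1.0 if hi == 0 else lo / hi)
    return sum(ratios) / len(ratios)

def overlap_ratio(cells_a, cells_b):
    """Shared information over merged information, modelled as covered cells."""
    union = cells_a | cells_b
    return len(cells_a & cells_b) / len(union) if union else 0.0

r = redundancy_score([1, 2, 4], [1, 4, 4])   # ratios 1, 0.5 and 1
o = overlap_ratio({1, 2, 3}, {3, 4})         # one shared cell out of four
```

A redundancy score of 1 means the two sources report identical values for every shared entry; lower values indicate that the sources differ and may therefore contribute distinct information.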
Step 2, preprocessing the multi-source geological data, is implemented as follows:
Geological data are collected from the available data sources, which may include geological survey reports, remote sensing images, map data, geological sample analysis results and the like. The collected data are classified and sorted by data type and source so that they are clearly organized and easy to process later.
Duplicate data entries are checked and removed to avoid data redundancy. Missing values are identified and handled by interpolation, deletion or filling strategies based on statistical methods. Abnormal values, which may result from measurement errors or data entry errors, are corrected or deleted. The data from the different data sources are converted into a unified format, such as CSV, GeoJSON or Shapefile, to facilitate subsequent data integration and analysis. All data are brought into the same coordinate reference system (CRS) so that they align accurately in spatial analysis. The data are standardized, for example scaled and centered, to eliminate the effect of differing dimensions or units across data sources. Categorical data may be encoded (for example by one-hot encoding) to convert them into a format suitable for numerical analysis. Relevant features, such as rock type, stratum thickness and geological structure, are extracted from the original data according to the requirements of the geological mapping task. Feature selection techniques (such as correlation analysis and principal component analysis) are used to reduce the number of features while retaining the information most useful to the task.
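A few of the preprocessing operations listed above (duplicate removal, filling of missing values, standardization and one-hot encoding) can be sketched with the Python standard library. The record layout and the field names `value` and `rock_type` are hypothetical, mean filling is just one of the strategies mentioned, and format conversion and coordinate system unification are deliberately left out of this sketch.

```python
from statistics import mean, pstdev

def preprocess(records, numeric_key="value", category_key="rock_type"):
    # 1. Remove exact duplicates while preserving the original order.
    seen, unique = set(), []
    for r in records:
        key = tuple(sorted(r.items()))
        if key not in seen:
            seen.add(key)
            unique.append(dict(r))
    # 2. Fill missing numeric values with the mean of the observed ones.
    observed = [r[numeric_key] for r in unique if r[numeric_key] is not None]
    fill = mean(observed)
    for r in unique:
        if r[numeric_key] is None:
            r[numeric_key] = fill
    # 3. Standardize the numeric field (center and scale to unit variance).
    values = [r[numeric_key] for r in unique]
    mu, sigma = mean(values), pstdev(values)
    for r in unique:
        r[numeric_key] = 0.0 if sigma == 0 else (r[numeric_key] - mu) / sigma
    # 4. One-hot encode the categorical field.
    categories = sorted({r[category_key] for r in unique})
    for r in unique:
        for c in categories:
            r[f"{category_key}={c}"] = 1 if r[category_key] == c else 0
    return unique

records = [
    {"x": 0, "y": 0, "value": 1.0, "rock_type": "granite"},
    {"x": 0, "y": 0, "value": 1.0, "rock_type": "granite"},  # duplicate
    {"x": 1, "y": 0, "value": None, "rock_type": "basalt"},  # missing value
    {"x": 2, "y": 0, "value": 3.0, "rock_type": "granite"},
]
clean = preprocess(records)
```

After this pass the duplicate is gone, the missing value has been replaced by the mean of the observed values, and each record carries one indicator column per rock type.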
In a preferred embodiment of the present invention, in step 3, the dynamic adjustment factor of each data source is calculated as follows:
;
Wherein the terms denote, respectively: three weight parameters; the information amount of a given data source; the data update frequency of that data source; a historical data point of that data source; the corresponding true value; the total number of data points, i.e. the number of data points considered when calculating the historical data accuracy of that data source; the total number of data sources, i.e. the number of different data sources considered in the dynamic adjustment process; the index used to traverse all data sources when calculating a total; and the index used to traverse all data points of a particular data source when calculating its accuracy.
In the embodiment of the invention, the formula jointly considers the information amount of the data source, its data update frequency and the accuracy of its historical data (obtained by comparing the historical data points with the true values), and thus evaluates the quality of the data source comprehensively. The three weight parameters allow the importance of each evaluation dimension to be adjusted flexibly according to actual requirements. Because the dynamic adjustment factor is computed dynamically, the system can adjust the weight of a data source in real time according to its performance, so that the better data sources among those available are favoured. Such dynamic optimization helps to improve the efficiency and accuracy of data processing. In the data fusion process, using the dynamic adjustment factor lets the contribution of each data source be adjusted according to its quality, which improves the overall quality of the fused data. By considering the accuracy of the historical data, the formula can to some extent identify and reduce the influence of unstable or erroneous data sources, thereby enhancing the robustness of the system. The formula is designed for multiple data sources (denoted q) and multiple data points (denoted p), so it scales well and can readily cope with growth in the number of data sources and the volume of data.
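A sketch of how such a factor might be computed is given below. The formula is not legible in this text, so the form used here is an assumption: a weighted sum of information amount, update frequency and an accuracy term, normalized over all data sources. The accuracy measure (one over one plus the mean squared error against the true values), the default weights, the dictionary keys and the assumption that information amount and update frequency are already scaled to comparable ranges are all illustrative choices.

```python
def accuracy(history, truth):
    """Assumed accuracy measure: 1 / (1 + mean squared error)."""
    p = len(history)  # number of data points considered
    mse = sum((h - t) ** 2 for h, t in zip(history, truth)) / p
    return 1.0 / (1.0 + mse)

def dynamic_factors(sources, alpha=0.4, beta=0.3, gamma=0.3):
    """Return one factor per source; the factors sum to 1 (assumed normalization)."""
    raw = [alpha * s["info"] + beta * s["freq"]
           + gamma * accuracy(s["history"], s["truth"])
           for s in sources]
    total = sum(raw)  # sum over all data sources
    return [r / total for r in raw]

sources = [
    # Source A reproduces the true values exactly.
    {"info": 0.8, "freq": 0.5, "history": [1.0, 2.0], "truth": [1.0, 2.0]},
    # Source B has the same information amount and frequency but is less accurate.
    {"info": 0.8, "freq": 0.5, "history": [2.0, 4.0], "truth": [1.0, 2.0]},
]
factors = dynamic_factors(sources)
```

With equal information amount and update frequency, the source whose history matches the true values receives the larger factor, which is the behaviour the paragraph above describes.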
In a preferred embodiment of the present invention, fusing the preprocessed multi-source geological data by using the dynamic adjustment factors in step 4 to obtain the fused geological data includes:
Step 41, multiplying the data in each data source by the corresponding dynamic adjustment factor to obtain weighted data source data;
Step 42, determining the units over which averages are calculated and, for each such unit, traversing the data sources and searching for data points that match the current unit, so as to extract the corresponding data values from all weighted data source data;
Step 43, for each unit, averaging all the extracted weighted data values;
Step 44, by extracting the corresponding data values and averaging them, calculating for each unit a comprehensive geological data value that fuses the information of multiple data sources, the comprehensive geological data values constituting the fused geological data.
In the embodiment of the invention, the data in each data source is weighted by the dynamic adjustment factors, so that the quality and reliability of different data sources can be fully considered, and the higher quality data sources are given greater weight in the fusion process. This helps to improve the accuracy and reliability of the post-fusion geological data. The use of dynamic adjustment factors allows for dynamic adjustment of the weights of the data sources according to their actual performance, which means that the system is able to optimize the weight allocation of the data sources in real time during the data fusion process. This flexibility helps to better cope with the changing quality of the data source and ensures the stability and reliability of the fusion result. By carrying out average calculation on the data of the plurality of weighted data sources, information provided by different data sources can be comprehensively considered, so that comprehensive geological data values with more comprehensive and more representative values can be obtained. The method is beneficial to improving the comprehensive analysis capability of geological data and providing more powerful data support for subsequent works such as geological exploration, resource evaluation and the like. In the data fusion process, the data deviation and uncertainty possibly caused by a single data source can be reduced through average calculation. The weighted fusion of multiple data sources helps to smooth data fluctuations and improve data stability and consistency. The fused geological data provides a more accurate and comprehensive information foundation for a decision maker, and is helpful for making a more scientific and reasonable decision scheme. In the fields of geological engineering, mineral resource development and the like, accurate fusion data is a key for formulating effective strategies. 
The method promotes cooperative application among geological data of different sources, and fully exerts complementary advantages of multi-source data. By means of fusion processing, information among different data sources can be effectively integrated, and the overall utilization efficiency and value of data are improved.
In a specific application, step 4 may comprise the following steps:
Step 41, acquiring all the preprocessed geological data sources and determining the dynamic adjustment factor corresponding to each data source. For each data point in each data source, the value is multiplied by the corresponding dynamic adjustment factor. The weighting is applied at the data source level, meaning that all data points of a given data source are multiplied by the same factor. After the weighting process is completed, the weighted data are stored for later use.
Step 42, determining a calculation unit for the averaging according to the fusion requirement, wherein the unit can be a geographic grid, a time interval, a specific geological feature and the like, and traversing all weighted data sources for each determined calculation unit. In each data source, the data point that matches the current calculation unit is located. Once a matching data point is found, its weighted data value is extracted. The extracted data values are used for the subsequent average calculation.
Step 43, for each calculation unit, collecting all the data values extracted from the weighted data sources and averaging them to obtain the average weighted data value of the calculation unit. This value represents a composite index that merges the information of multiple data sources.
Step 44, repeating step 42 and step 43 until all the calculation units are processed. Eventually, each calculation unit obtains a comprehensive geological data value. These values constitute the fused geological data, which can be used for further geological analysis or decision support.
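Steps 41 to 44 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: the function name `fuse_sources`, the dictionary layout (source name mapped to calculation unit mapped to value) and the choice to divide by the number of sources that actually contain a matching data point are all hypothetical and are not specified in the description.

```python
from collections import defaultdict


def fuse_sources(sources, factors):
    """Fuse per-unit values from several data sources (steps 41 to 44).

    sources: {source_name: {unit_id: raw_value}}   (hypothetical layout)
    factors: {source_name: dynamic_adjustment_factor}
    Returns {unit_id: average of the weighted values found for that unit}.
    """
    # Step 41: multiply every data point of a source by that source's factor.
    weighted = {
        name: {unit: value * factors[name] for unit, value in points.items()}
        for name, points in sources.items()
    }

    # Step 42: for each calculation unit, gather the matching weighted values.
    per_unit = defaultdict(list)
    for points in weighted.values():
        for unit, value in points.items():
            per_unit[unit].append(value)

    # Steps 43 and 44: average the gathered values, unit by unit.
    # Assumption: the divisor is the number of sources that have a
    # matching data point for the unit.
    return {unit: sum(vals) / len(vals) for unit, vals in per_unit.items()}
```

Here the calculation unit is represented only by an opaque identifier, so the same function could serve for grid cells, time intervals or geological features once each data point has been matched to its unit.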
For example, assume that there are three geological data sources (A, B, C), each containing metal content data at different locations in a region. It is desirable to fuse these data sources to obtain a more accurate estimate of the metal content at each location. In step 41, the three geological data sources (A, B, C) are first weighted. The metal content data contained in each data source is multiplied by a dynamic adjustment factor according to the accuracy and resolution evaluation results. Specifically:
Data source A, because of its higher accuracy and resolution, is given a greater weight, with a dynamic adjustment factor of 0.6. This means that each metal content value in data source A is multiplied by 0.6.
Data source B is slightly lower in accuracy and resolution than data source A, so its dynamic adjustment factor is 0.3. Each metal content value in data source B is multiplied by this factor.
Data source C, which is lower in accuracy and resolution than A and B, is given less weight, with a dynamic adjustment factor of 0.1. Likewise, all data in data source C are multiplied by 0.1.
Through the weighting process, the data of each data source is adjusted according to the quality of that data source, laying a foundation for the subsequent data fusion.
In step 42, the calculation unit is determined to be a grid of geographic locations. This means that the whole area is divided into several grids, each representing a specific geographical location.
For each grid location, the metal content data corresponding to the grid needs to be looked up in the weighted data sources A, B and C. Each grid has a central or representative point whose geographic coordinates (longitude and latitude) are known. The longitude and latitude information in the data sources is read; because it may be stored in different formats, such as decimal degrees or degrees-minutes-seconds, it needs to be converted to a single format before comparison. A coordinate matching algorithm is then used to determine which grid each data point in the data source belongs to, which can be done by calculating the distance of the coordinates of the data point from the center point or boundary of each grid. A data point is considered to belong to a grid if its coordinates fall within the boundaries of the grid or are closest to the center point of the grid. Once the grid to which the data point belongs is determined, the corresponding data (e.g., metal content) can be extracted from the data source and associated with the grid coordinates. This typically involves creating a data structure (e.g., a database table or array) for storing the data values of each grid.
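The coordinate conversion and grid matching described above can be sketched as follows. The sketch assumes a regular grid of square cells measured in degrees, with its lower-left corner at a known origin, and uses the "falls within the boundaries" criterion; the function names and the `(lon, lat, value)` tuple layout are hypothetical.

```python
import math


def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees-minutes-seconds to signed decimal degrees.

    hemisphere is one of "N", "S", "E", "W"; south and west are negative.
    """
    value = degrees + minutes / 60.0 + seconds / 3600.0
    return -value if hemisphere in ("S", "W") else value


def grid_cell(lon, lat, origin_lon, origin_lat, cell_size):
    """Return the (column, row) of the grid whose boundaries contain the point.

    Assumes square cells, cell_size degrees on a side, with the lower-left
    corner of the grid at (origin_lon, origin_lat).
    """
    col = math.floor((lon - origin_lon) / cell_size)
    row = math.floor((lat - origin_lat) / cell_size)
    return col, row


def assign_to_grid(points, origin_lon, origin_lat, cell_size):
    """Group (lon, lat, value) data points by the grid that contains them."""
    table = {}
    for lon, lat, value in points:
        cell = grid_cell(lon, lat, origin_lon, origin_lat, cell_size)
        table.setdefault(cell, []).append(value)
    return table
```

A nearest-center criterion, which the description also allows, would replace `grid_cell` with a distance comparison against each grid's center point.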
In step 43, after the weighted metal content data for each grid location has been extracted from data sources A, B and C, an average calculation is performed. The goal of this step is to fuse information from the different data sources to obtain a more accurate and comprehensive estimate of metal content.
Taking a certain grid position as an example, let the weighted data extracted from data source A be 10, from data source B be 15, and from data source C be 20. To obtain the comprehensive metal content data value for this grid location, the three values are added (10+15+20=45) and the sum is divided by the number of data sources (i.e., 3), giving an average value of 15.
This average represents an estimate of the metal content of the grid location fused with information from three sources. In this way, the advantages of multiple data sources can be utilized to improve the accuracy and reliability of metal content estimation.
In step 44, after the average calculation for all grid positions is completed, a comprehensive metal content data value is obtained for each grid position. These data values are the weighted and averaged results of multiple data sources and are therefore more accurate and comprehensive than the information provided by a single data source. The comprehensive metal content data values form the fused geological data; by fusing the information of a plurality of data sources, geological resources can be better understood and utilized.
In a preferred embodiment of the present invention, the step 5 of extracting the target geological mapping information by using a geological information extraction algorithm based on the fused geological data includes:
Step 51, performing statistical analysis on the fused geological data to determine the numerical range and distribution characteristics of the data, wherein the statistical analysis comprises calculating statistics such as the maximum value, minimum value, average value and standard deviation of the data so as to understand the overall numerical range of the data.
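The statistics named in step 51 can be computed with the Python standard library, as in the minimal sketch below. The function name `summarize` is hypothetical, and the use of the population standard deviation is an assumption, since the description does not say whether the population or sample form is intended.

```python
import statistics


def summarize(values):
    """Return the basic statistics of step 51 for a list of fused values."""
    return {
        "max": max(values),
        "min": min(values),
        "mean": statistics.mean(values),
        # Assumption: population standard deviation.
        "std": statistics.pstdev(values),
    }
```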
Step 52, setting a threshold according to how the target geological feature manifests in the data. Specifically, the target is first determined, for example identifying regions with high metal content. Relevant geological research literature, reports or databases are then consulted to understand the historical data and research results on metal content in the study region, with particular attention to the metal content values of the high metal content regions identified in previous research. Next, based on the statistical analysis result of the fused geological data, the numerical range and distribution of the metal content are examined, with attention to the central tendency, the degree of dispersion and possible outliers of the data. Based on the values of the high metal content regions identified in previous research, and in combination with the distribution of the current data, a threshold for the metal content is preliminarily set; the threshold may be a specific value or a range of values. The data is screened with the preliminarily set threshold to find the data points that exceed it. The screened high metal content regions are compared with those in previous research and checked for consistency, and the screening result may also be verified using the knowledge and experience of geological experts. If the preliminarily set threshold proves too high or too low, so that the screening result deviates considerably from the actual situation, the threshold is raised or lowered according to the verification result until the screening result matches the actual geological conditions. The verified and adjusted threshold is taken as the final threshold.
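The consistency check in step 52 can be illustrated with a small sketch. The agreement measure used here (the Jaccard overlap between the screened units and the high metal content units reported in previous research) and the names `screen`, `agreement` and `pick_threshold` are assumptions made for illustration; the description itself leaves the comparison to previous research and expert judgement, which this sketch does not replace.

```python
def screen(fused, threshold):
    """Return the set of units whose fused value exceeds the threshold."""
    return {unit for unit, value in fused.items() if value > threshold}


def agreement(screened, reference):
    """Jaccard overlap between screened units and previously known units."""
    union = screened | reference
    return len(screened & reference) / len(union) if union else 1.0


def pick_threshold(fused, reference, candidates):
    """Try each candidate threshold and keep the one whose screening result
    best matches the reference set (a stand-in for manual adjustment)."""
    return max(candidates, key=lambda t: agreement(screen(fused, t), reference))
```

In practice the candidate list could be derived from the statistics of step 51, for example the average value plus a few multiples of the standard deviation.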
Step 53, traversing the fused geological data and judging, for each data point, whether its attribute value (e.g., metal content, rock type, etc.) exceeds the set threshold. If the attribute value of a data point exceeds the threshold, the data point is marked as a target geological feature point, indicating that it may belong to a geological feature region of interest; otherwise, it is marked as a non-target point.
Finally, a new data table is created to store the extracted target geological mapping information, and the position information and attribute value of each data point marked as a target geological feature point are added to it. The new data table thus contains all relevant information on the target geological feature points, such as their location coordinates (latitude and longitude) and attribute values (e.g., metal content). In this way, the extracted target geological feature information can be conveniently queried, analyzed and displayed.
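Step 53 amounts to a single pass over the fused data, as the following sketch shows. The record layout (dictionaries with `lon`, `lat` and `value` keys), the label strings and the function name `extract_targets` are hypothetical, and a plain list of dictionaries stands in for the data table.

```python
def extract_targets(points, threshold):
    """Mark each data point and collect the target points into a table.

    points: iterable of dicts with "lon", "lat" and "value" keys
            (hypothetical record layout).
    Returns (labels, table): labels holds one mark per input point, in
    input order; table holds position and attribute of the target points.
    """
    labels = []
    table = []
    for point in points:
        if point["value"] > threshold:
            labels.append("target")
            # Store position information and attribute value in the table.
            table.append(
                {"lon": point["lon"], "lat": point["lat"], "value": point["value"]}
            )
        else:
            labels.append("non-target")
    return labels, table
```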
In the embodiment of the invention, the numerical range and the distribution characteristics of the data can be more accurately known by carrying out statistical analysis on the fused geological data, which is helpful for identifying and correcting abnormal values or errors in the data, thereby improving the overall data quality. By setting a specific threshold to identify the target geological feature, accurate capture of the specific geological phenomenon can be achieved. The method can be used for rapidly screening out data points meeting specific conditions from a large amount of data, and provides powerful support for geological research and resource exploration. The process of automatically traversing and judging whether the data points exceed the set threshold value can greatly reduce the time of manual analysis and processing and improve the efficiency of geological mapping work. By creating a new data table to store the extracted target geological mapping information, systematic management and efficient querying of such information can be achieved. This facilitates subsequent data analysis and decision support, making the geologic information easier to access and use.
Step 6 above, performing information screening on the extracted target geological mapping information to form a final geological mapping information extraction result, may include:
Step 61, evaluating the data quality of the extracted target geological mapping information. This may be done by checking the data for null values, outliers or duplicate values. For example, if the attribute value of a data point deviates significantly from those of other data points or is not within the expected range of values, it may be considered an outlier.
Step 62, setting screening conditions using the statistics (e.g., average value, standard deviation, etc.) calculated in step 51. For example, only data points whose attribute values lie within a certain multiple of the standard deviation from the average value may be retained, so that extreme outliers are eliminated. The extracted target geological mapping information is then traversed, and each data point is screened according to whether its attribute value meets the screening conditions: data points meeting the conditions are retained and data points not meeting them are eliminated.
Step 63, organizing the data points remaining after screening into the final geological mapping information extraction result. The result may be a reduced data set that contains only high-quality target geological feature points meeting the screening conditions. In this way, step 6 can automatically select high-quality geological mapping information, which avoids the subjectivity and tedium of manual screening and improves screening efficiency and accuracy.
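Steps 61 to 63 can be sketched as one filtering pass. The record layout, the function name `screen_targets` and the default multiple `k = 3.0` are assumptions; the description only speaks of "a certain multiple" of the standard deviation.

```python
import math


def screen_targets(table, mean, std, k=3.0):
    """Drop null, duplicate and outlying records (steps 61 to 63).

    table: list of dicts with "lon", "lat", "value" keys (hypothetical layout)
    mean, std: statistics calculated in step 51
    k: multiple of the standard deviation (assumed default of 3.0)
    """
    seen = set()
    result = []
    for record in table:
        value = record["value"]
        # Step 61: skip null values (None or NaN).
        if value is None or (isinstance(value, float) and math.isnan(value)):
            continue
        # Step 61: skip exact duplicates of an earlier record.
        key = (record["lon"], record["lat"], value)
        if key in seen:
            continue
        seen.add(key)
        # Step 62: retain values within k standard deviations of the mean.
        if abs(value - mean) <= k * std:
            result.append(record)
    # Step 63: the retained records form the final extraction result.
    return result
```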
As shown in fig. 2, an embodiment of the present invention further provides a geological mapping information extraction system based on multi-source data fusion, including:
the collecting module is used for collecting multisource geological data, including satellite remote sensing data, geological exploration data, geophysical data, geochemical data and ground actual measurement data;
the preprocessing module is used for preprocessing the multi-source geological data to obtain preprocessed multi-source geological data;
the calculation module is used for calculating the dynamic adjustment factor of each data source based on the information quantity provided by each data source, the frequency of data updating and the accuracy of historical data;
the fusion module is used for fusing the preprocessed multi-source geological data by utilizing the dynamic adjustment factors so as to obtain fused geological data;
the extraction module is used for extracting target geological mapping information by using a geological information extraction algorithm based on the fused geological data;
and the screening module is used for carrying out information screening on the extracted target geological mapping information so as to form a final geological mapping information extraction result.
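The way the six modules chain together can be sketched as a small pipeline. This is only an illustration of the data flow: the class name `MappingSystem`, the callable-based interfaces and the argument passed to each module are hypothetical and are not taken from the description.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class MappingSystem:
    """The six modules of the system, each supplied as a callable."""

    collect: Callable[[], Any]
    preprocess: Callable[[Any], Any]
    compute_factors: Callable[[Any], Any]
    fuse: Callable[[Any, Any], Any]
    extract: Callable[[Any], Any]
    screen: Callable[[Any], Any]

    def run(self):
        """Run the modules in the order of steps 1 to 6 of the method."""
        raw = self.collect()
        clean = self.preprocess(raw)
        factors = self.compute_factors(clean)
        fused = self.fuse(clean, factors)
        targets = self.extract(fused)
        return self.screen(targets)
```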
It should be noted that the system corresponds to the above method, and all implementation manners in the above method embodiment are applicable to this system embodiment and achieve the same technical effect.
Embodiments of the invention also provide a computing device comprising a processor and a memory storing a computer program which, when executed by the processor, performs the method described above. All the implementation manners in the method embodiment are applicable to this embodiment, and the same technical effect can be achieved.
Embodiments of the present invention also provide a computer-readable storage medium storing instructions that, when executed on a computer, cause the computer to perform the method described above. All the implementation manners in the method embodiment are applicable to this embodiment, and the same technical effect can be achieved.
While the foregoing is directed to the preferred embodiments of the present invention, it will be appreciated by those skilled in the art that various modifications and adaptations can be made without departing from the principles of the present invention, and such modifications and adaptations are intended to be comprehended within the scope of the present invention.