CN118037304B

CN118037304B - Financial risk grade marking method and system based on data mining

Info

Publication number: CN118037304B
Application number: CN202410322464.8A
Authority: CN
Inventors: 李忠鑫; 范修祥; 翁晓炜
Original assignee: Beijing Qingqian Information Technology Co ltd
Current assignee: Beijing Qingqian Information Technology Co ltd
Priority date: 2024-03-20
Filing date: 2024-03-20
Publication date: 2024-12-27
Anticipated expiration: 2044-03-20
Also published as: CN118037304A

Abstract

The present invention relates to the technical field of financial risk assessment, and in particular to a method for labeling financial risk levels based on data mining. The method comprises the following steps: obtaining target financial product data; determining the business field of the target financial product based on the target financial product data, thereby obtaining the industry data to which the product belongs; obtaining data collection target city data; collecting economic indicator data of the target city based on the data collection target city data, thereby obtaining an economic indicator data set; mining the implicit economic operation law of the economic indicator data set, thereby obtaining economic operation law data; labeling the target financial product with preliminary risk factors based on the economic operation law data, thereby obtaining preliminary risk factor data. This solution can achieve risk level labeling of financial products and investors through data mining and multi-dimensional evaluation methods.

Description

Financial risk grade marking method and system based on data mining

Technical Field

The invention relates to the technical field of financial risk assessment, in particular to a financial risk level marking method based on data mining.

Background

Financial risk management is widely regarded as one of the critical issues that financial institutions and investors must face. Effective financial risk management requires comprehensive risk assessment and labeling of financial products and portfolios. Traditional financial risk assessment methods rely mainly on statistical models and expert experience, and they have obvious limitations such as high subjectivity and high model complexity. Conventional methods typically rely on static models, which struggle to adapt to the dynamically changing demands of financial markets. Most existing methods adopt traditional statistical techniques and often classify risk only qualitatively along a single dimension, which makes multi-dimensional quantitative assessment difficult. The result is a lack of comprehensive insight into financial risk.

Disclosure of Invention

Accordingly, the present invention is directed to a financial risk level labeling method based on data mining, so as to solve at least one of the above-mentioned problems.

In order to achieve the above purpose, a financial risk level labeling method based on data mining comprises the following steps:

Step S1, acquiring target financial product data, determining the business field of the target financial product according to the target financial product data, and thus acquiring industry data of the product;

Step S2, acquiring data acquisition target city data, collecting economic index data of a target city according to the data acquisition target city data so as to obtain an economic index data set, mining the implicit economic operation rules of the economic index data set so as to obtain economic operation rule data, and marking preliminary risk factors of the target financial product according to the economic operation rule data so as to obtain preliminary risk factor data;

Step S3, performing asset association network analysis on the enterprise to which the target financial product belongs to obtain asset association network data, performing technical innovation capability assessment on the enterprise to which the target financial product belongs to obtain technical innovation capability assessment data, and performing risk resistance assessment on the enterprise to which the target financial product belongs according to the asset association network data and the technical innovation capability assessment data to obtain comprehensive risk resistance data of the enterprise;

Step S4, acquiring enterprise operation data of the target financial product to obtain an enterprise operation data set, performing feature extraction and dimension reduction on the enterprise operation data set to obtain product operation risk feature data, and grading the product operation risk feature data to obtain product potential operation risk data;

Step S5, constructing a comprehensive risk factor network for the preliminary risk factor data and the product potential risk data so as to obtain product risk factor network architecture data, marking the final risk level of the target financial product according to the product risk factor network architecture data so as to obtain final risk level data, and performing multidimensional visualization processing for the final risk level data so as to obtain a product risk level chart;

Step S6, carrying out relevant record collection on the target financial product account of the investor so as to obtain the account record data of the investor, carrying out personal asset distribution analysis on the investor so as to obtain personal asset distribution data, carrying out credit risk assessment on the investor according to the account record data of the investor and the personal asset distribution data so as to obtain the credit risk assessment data of the investor, and sending the credit risk assessment data of the investor to an enterprise to which the target financial product belongs.

By acquiring the target financial product data and determining the business field of the target financial product, the invention obtains the industry data of the product. This step provides basic information for subsequent risk assessment, because financial products in different industries may face different risk factors. With the industry data of the product, financial risk levels can be estimated and marked more accurately. Economic operation rule data are obtained by collecting the economic index data of the target city and mining its implicit economic operation rules. These rule data can be used to label the target financial product with preliminary risk factors, providing more comprehensive information for risk assessment. By analyzing the economic operation rules, the risk level of the financial product can be estimated more accurately. Asset association network data and technical innovation capability assessment data are obtained by performing asset association network analysis and technical innovation capability assessment on the enterprise to which the target financial product belongs. These data can be used to evaluate the risk resistance of the enterprise and to obtain comprehensive risk resistance data for the enterprise. Because the risk of a financial product is often related to the risk of the enterprise to which it belongs, this step provides an important indicator for assessing the risk of the financial product. Product operation risk feature data are obtained by collecting the operation data of the enterprise to which the target financial product belongs and applying feature extraction and dimension reduction. These data can be used to grade the potential operation risk of the product, thereby obtaining the product potential operation risk data.
By constructing a comprehensive risk factor network for the preliminary risk factor data and the product potential risk data, the correlations among different factors can be taken into account in the risk assessment. The risk level of the target financial product can therefore be estimated more comprehensively, improving the accuracy and reliability of the estimate. The final risk level of the target financial product is marked according to the product risk factor network architecture data, and the comprehensive risk factors are incorporated into the evaluation result so that the result is more actionable. By labeling the risk factors of financial products, evaluating the risk resistance of enterprises, and analyzing the potential risk of products, the risk level of financial products can be evaluated comprehensively from multiple angles. This helps financial institutions understand the potential risk of a product more accurately and adopt appropriate risk management strategies and measures. By collecting records related to the investor's account and analyzing the investor's personal asset distribution, the investor's personal situation and asset status can be understood. Based on these data, a credit risk assessment is performed for the investor, which assesses the investor's level of credit risk with respect to the financial product. The credit risk assessment data of the investor are sent to the enterprise to which the target financial product belongs, so that the enterprise can understand the investor's credit condition more fully and manage risk better. The invention thus realizes the grade marking of financial risk and, through credit risk analysis and intelligent labeling of investors, realizes bidirectional information transmission and interaction between investors and enterprises.
In this two-way information transmission and interaction, the financial institution can not only understand the user better but also convey risk management and financial advice to help the user make better-informed financial decisions. Meanwhile, users can communicate their own needs and feedback through interaction with the financial institution, which encourages the institution to meet user needs better and to keep optimizing its products and services. In summary, the risk assessment method and system comprehensively consider multiple factors of a financial product, such as its industry, the economic environment, the risk resistance of the enterprise, the influence of product spokespersons, and the credit status of investors, and comprehensively assess and grade the risk of the financial product. Conventional assessment methods typically consider only a single dimension, making it difficult to fully understand the risk of a financial product. By considering multiple factors together, the data mining based method assesses the risk of a financial product from different angles and improves the accuracy and reliability of the assessment. Traditional evaluation methods often rely on static models and adapt poorly to dynamic changes in the market environment. The data mining based method can update the evaluation result in a timely manner by monitoring changes in factors such as economic index data and the influence of product spokespersons in real time, and therefore adapts better to market changes. It can also evaluate dynamically as data are continuously updated: through real-time data monitoring combined with machine learning and model updating, risk level labels can be adjusted and updated promptly to adapt to changes in the market environment.

Drawings

Other features, objects and advantages of the invention will become more apparent upon reading of the detailed description of a non-limiting implementation, made with reference to the accompanying drawings in which:

Fig. 1 is a schematic flow chart of steps of a financial risk level labeling method based on data mining according to an embodiment.

Fig. 2 shows a detailed step flow diagram of step S2 of an embodiment.

Fig. 3 shows a detailed step flow diagram of step S25 of an embodiment.

Detailed Description

The following is a clear and complete description of the technical method of the present patent in conjunction with the accompanying drawings, and it is evident that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the present invention without making any inventive effort, are intended to fall within the scope of the present invention.

Furthermore, the drawings are merely schematic illustrations of the present invention and are not necessarily drawn to scale. The same reference numerals in the drawings denote the same or similar parts, and thus a repetitive description thereof will be omitted. Some of the block diagrams shown in the figures are functional entities and do not necessarily correspond to physically or logically separate entities. The functional entities may be implemented in software, in one or more hardware modules or integrated circuits, or in different networks and/or processor devices and/or microcontroller devices.

It will be understood that, although the terms "first," "second," etc. may be used herein to describe various elements, these elements should not be limited by these terms. These terms are only used to distinguish one element from another element. For example, a first element could be termed a second element, and, similarly, a second element could be termed a first element, without departing from the scope of example embodiments. The term "and/or" as used herein includes any and all combinations of one or more of the associated listed items.

In order to achieve the above objective, referring to Figs. 1 to 3, the present invention provides a financial risk level labeling method based on data mining, the method including the following steps:

Specifically, for example, data such as historical prices and trading volumes for a stock investment product may be obtained from an API or database of a financial data provider. The acquired financial product data are analyzed, including statistical features and price trends, to understand the basic condition of the product. The business field of the product is then determined from the acquired financial product data, for example, by determining that the product belongs to the financial service industry.

Specifically, for example, the name of the data collection target city may be acquired to identify the target city. The acquired economic index data are cleaned and organized to ensure accuracy and integrity. The economic index data are analyzed and mined using statistical analysis, time series analysis, machine learning algorithms, or similar methods to find the implicit economic operation rules in the data. The economic operation rule data obtained by mining are then applied to the risk labeling of the target financial product. For example, fluctuations in certain economic indicators may be used as preliminary risk factors when labeling the risk level of the financial product.

Specifically, for example, financial reports, annual reports, and other published information of the target enterprise may be collected. Information such as the asset structure, investment relationships, and affiliated subsidiaries of the enterprise is analyzed using data mining and relationship analysis techniques, such as association rule mining algorithms. An asset association network is constructed using graph theory algorithms and network analysis methods, such as graph traversal and community detection algorithms, to determine the asset relationships and interactions inside and outside the enterprise. A risk resistance assessment model or index system is then constructed by comprehensively considering factors such as the asset structure, association relationships, and technical innovation capability of the enterprise. The risk resistance of the enterprise is evaluated quantitatively using methods such as statistical analysis, risk models, and machine learning, for example, regression analysis and Monte Carlo simulation. The evaluation result may include comprehensive indicators such as the risk-bearing capacity, stress resistance, and crisis response capacity of the enterprise.
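
As one illustration of the quantitative evaluation mentioned above, the following minimal sketch uses Monte Carlo simulation to estimate how often an enterprise's net assets stay positive under random shocks. The function name, the normally distributed shock model, and all figures are assumptions made for illustration; none of them is specified by the patent.

```python
import random

def survival_probability(assets, liabilities, shock_mean, shock_std,
                         trials=10_000, seed=0):
    """Estimate the share of simulated scenarios in which net assets stay positive.

    Hypothetical model: each trial applies one relative shock to the asset
    value, drawn from Normal(shock_mean, shock_std).
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    survived = 0
    for _ in range(trials):
        shocked_assets = assets * (1.0 + rng.gauss(shock_mean, shock_std))
        if shocked_assets - liabilities > 0:
            survived += 1
    return survived / trials

# Illustrative figures only.
p = survival_probability(assets=100.0, liabilities=60.0,
                         shock_mean=-0.05, shock_std=0.2)
print(f"estimated survival probability: {p:.3f}")
```

A higher estimated probability would correspond to stronger risk resistance under this simplified model; a real assessment would need shock distributions fitted to the enterprise's own data.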

Specifically, for example, enterprise operation data, including financial statements, sales data, and production data, may be obtained from an internal database of the financial company. Related economic index data, such as industry output values and industry growth rates, are obtained from public data sources. External information such as news reports and industry analysis reports about the company is collected from the internet using web crawler technology. The collected enterprise operation data are cleaned and preprocessed to remove noise and outliers. Features related to product operation risk, such as financial indicators (profit margin, asset-liability ratio, etc.) and market indicators (market share, industry competitiveness, etc.), are extracted using feature engineering techniques. The high-dimensional feature data are converted into a low-dimensional representation using a dimension reduction method (for example, principal component analysis) to reduce the dimensionality and complexity of the data. An operation risk assessment model is built using either a supervised learning algorithm (such as a decision tree, random forest, or support vector machine) or an unsupervised learning algorithm (such as a clustering algorithm). The operation risk assessment model is trained with labeled historical data as the training set so as to predict the potential operation risk of the product. The products are then graded according to the model predictions, for example as high risk, medium risk, or low risk.
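
The dimension reduction and grading steps above can be sketched as follows. This is a simplified illustration rather than the patent's implementation: it computes only the first principal component by power iteration, and it replaces the trained assessment model with fixed score cut-offs. The feature columns, the cut-off values, and the assumption that larger feature values mean higher risk are all hypothetical.

```python
import math

def first_principal_component(rows, iterations=200):
    """Return (direction, scores) of the first principal component.

    rows: list of equal-length feature vectors. Power iteration on the sample
    covariance matrix is adequate for small illustrative data sets.
    """
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    v = [1.0 / math.sqrt(d)] * d  # starting vector
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # no variance to explain
            break
        v = [x / norm for x in w]
    scores = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, scores

def grade(score, low_cut=-0.2, high_cut=0.2):
    """Map a one-dimensional risk score to a grade (hypothetical cut-offs)."""
    if score >= high_cut:
        return "high"
    if score <= low_cut:
        return "low"
    return "medium"

# Hypothetical features per product: [asset-liability ratio, loss rate].
features = [[0.2, 0.1], [0.4, 0.2], [0.6, 0.3], [0.8, 0.4]]
direction, scores = first_principal_component(features)
grades = [grade(s) for s in scores]
print(grades)
```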

Specifically, for example, the preliminary risk factor data and the product potential risk data may be analyzed together to construct a product risk factor network architecture. A comprehensive risk factor network is established by taking into account the interrelationships among the individual factors and their weights. According to the product risk factor network architecture data, the final risk level of the target financial product is marked using a predefined evaluation model or algorithm. This may involve setting risk level classification criteria and a scoring system that evaluates the product based on the weights and degrees of influence of the different factors. Multidimensional visualization is then applied to the final risk level data to generate a product risk level chart. This may include using charts, graphs, and indicators to show the distribution of different risk levels, as well as comparisons with other products or market indicators.
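
To make the idea of weighted, interrelated factors concrete, the sketch below combines factor scores into one composite score and maps it to a level. The one-step influence propagation, the 0 to 100 scale, the weights, and the level thresholds are assumptions chosen for illustration; the patent does not prescribe a particular scoring scheme.

```python
def composite_risk_score(factor_scores, factor_weights, influences):
    """Combine factor scores (0-100) into one weighted score.

    factor_scores: {factor: score}
    factor_weights: {factor: weight}; weights need not sum to 1
    influences: {(source, target): strength in [0, 1]}; one propagation step
        adds a share of the source factor's score to the target factor.
    """
    adjusted = dict(factor_scores)
    for (source, target), strength in influences.items():
        adjusted[target] = min(100.0,
                               adjusted[target] + strength * factor_scores[source])
    total_weight = sum(factor_weights.values())
    return sum(adjusted[f] * factor_weights[f] for f in adjusted) / total_weight

def risk_level(score, medium_cut=40.0, high_cut=70.0):
    """Translate a composite score into a level (hypothetical thresholds)."""
    if score >= high_cut:
        return "high"
    if score >= medium_cut:
        return "medium"
    return "low"

# Illustrative factors: a macroeconomic factor that partly drives operation risk.
scores = {"macro": 60.0, "operation": 40.0}
weights = {"macro": 1.0, "operation": 1.0}
edges = {("macro", "operation"): 0.25}
final_score = composite_risk_score(scores, weights, edges)
print(final_score, risk_level(final_score))
```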

Specifically, for example, account record data associated with the investor may be collected, including investment transaction records and fund flow records. Such data may be obtained from the trading platform or related financial institution used by the investor. The investor's personal assets are analyzed, including the distribution of funds and of the investment portfolio; this can be obtained by aggregating and analyzing the account record data. A credit risk assessment is performed on the investor using a credit assessment model or algorithm based on the investor's account record data and personal asset distribution data. This may involve analysis and comprehensive assessment of credit history, repayment capability, liabilities, asset stability, and so on. Credit risk assessment data for the investor are generated from the assessment result; these may take the form of a score or rating that indicates the investor's level of credit risk. The credit risk assessment data are transmitted to the investment company to which the target financial product belongs. The transmission may use encrypted and secure communication channels to ensure the confidentiality and integrity of the data.
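
A minimal sketch of such a score is given below. It assumes three hypothetical inputs: leverage (liabilities over total assets), the share of late payments, and the concentration of the asset distribution measured with the Herfindahl-Hirschman index. The weights are arbitrary illustration values, not values taken from the patent.

```python
def asset_concentration(holdings):
    """Herfindahl-Hirschman index of asset shares: 1/n if even, 1.0 if one asset."""
    total = sum(holdings.values())
    return sum((value / total) ** 2 for value in holdings.values())

def credit_risk_score(holdings, liabilities, payments_due, payments_late):
    """Return a 0-100 score where a higher value means higher credit risk."""
    total_assets = sum(holdings.values())
    leverage = min(1.0, liabilities / total_assets)
    late_rate = payments_late / payments_due if payments_due else 0.0
    concentration = asset_concentration(holdings)
    # Hypothetical weights for the three components.
    return 100.0 * (0.4 * leverage + 0.4 * late_rate + 0.2 * concentration)

# Illustrative investor: evenly split holdings, modest debt, one late payment in ten.
investor_holdings = {"funds": 50.0, "bonds": 50.0}
score = credit_risk_score(investor_holdings, liabilities=25.0,
                          payments_due=10, payments_late=1)
print(round(score, 2))
```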

Preferably, step S2 comprises the steps of:

Step S21, acquiring data acquisition target city data;

Specifically, for example, the target cities for data collection, such as Beijing and Shanghai, may be specified.

Step S22, formulating a data source strategy according to the data acquisition target city data so as to obtain data source strategy data;

Specifically, for example, the characteristics and requirements of the data acquisition target city data may be analyzed to determine the objectives of the data source strategy. The data source strategy is then formulated by considering factors such as the reliability, coverage, update frequency, and data format of the data.

Step S23, acquiring economic index data of a target city according to the data source strategy data, thereby acquiring an economic index data set, wherein the economic index data set comprises economic index data acquired by a plurality of different data sources;

Specifically, for example, economic index data may be obtained from the corresponding data sources according to the formulated data source strategy. The data from the different data sources are cleaned and integrated to ensure accuracy and consistency. The economic index data from the different data sources are combined into a comprehensive economic index data set. The economic index data are stored and managed, for which a database, a data warehouse, or other tools can be used for storage and indexing.

Step S24, reliability calculation is carried out on each data source in the economic index data set, so that reliability data corresponding to each data source is obtained, and the economic index data acquired by the data source corresponding to the reliability data lower than a preset reliability threshold is removed, so that a reliable economic index data set is obtained;

Specifically, for example, the credibility calculation and data cleansing may be performed by evaluating each data source on factors such as the authority of the source, the credibility of the issuing organization, and the accuracy of the data, and assigning a credibility score to each data source. Alternatively, the data source credibility calculation formula may be used to calculate the credibility of each data source in the economic index data set, yielding the credibility data of each data source. A preset credibility threshold is set, for example 80 points (out of 100). For each data source in the economic index data set, its credibility data are checked. If the credibility is below the preset threshold (for example, below 80 points), the economic index data corresponding to that data source are removed.
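
The threshold filtering described above can be sketched as follows. The record layout (a `source` field on each record), the source names, and the rule of treating unscored sources as having zero credibility are assumptions for illustration; only the 80-point threshold comes from the example in the text.

```python
def filter_reliable_sources(dataset, credibility, threshold=80.0):
    """Keep only records whose data source meets the credibility threshold.

    dataset: list of dicts, each with a "source" key (hypothetical layout)
    credibility: {source: score on a 0-100 scale}
    Sources without a score are treated as 0 and therefore removed.
    """
    return [record for record in dataset
            if credibility.get(record["source"], 0.0) >= threshold]

# Illustrative records from three hypothetical sources.
economic_index_data = [
    {"source": "statistics_bureau", "indicator": "gdp_growth", "value": 5.2},
    {"source": "news_aggregator", "indicator": "gdp_growth", "value": 6.8},
    {"source": "unrated_blog", "indicator": "gdp_growth", "value": 9.9},
]
credibility_scores = {"statistics_bureau": 95.0, "news_aggregator": 72.0}

reliable_data = filter_reliable_sources(economic_index_data, credibility_scores)
print([record["source"] for record in reliable_data])
```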

Step S25, marking the preliminary risk factors of the target financial product according to the trusted economic index data set, so as to obtain preliminary risk factor data.

Specifically, for example, the risk factors of the target financial product may be defined, and the economic indicators associated with the target financial product determined according to requirements and domain knowledge. For a bond product, for example, the factors may include the interest rate, the economic growth rate, and the inflation rate. Economic index data related to the risk factors of the target financial product are extracted from the trusted economic index data set. The extracted economic index data are given preliminary labels, classifying each as a high, medium, or low risk factor according to historical data or professional judgment. The preliminary labeling results are then compiled into the preliminary risk factor data, which comprise each economic indicator and its corresponding risk level.
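
One possible way to label indicators from historical data is sketched below. It assumes that an indicator's volatility, measured by its coefficient of variation, stands in for its risk level. That choice, the cut-off values, and the sample series are illustrative assumptions; the text leaves the labeling criterion to historical data or professional judgment.

```python
import statistics

def label_indicator(values, medium_cut=0.05, high_cut=0.15):
    """Label one indicator by the coefficient of variation of its history."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    variation = statistics.pstdev(values) / abs(mean)
    if variation >= high_cut:
        return "high"
    if variation >= medium_cut:
        return "medium"
    return "low"

# Illustrative histories for the bond-product factors named in the text.
history = {
    "interest_rate": [3.0, 3.1, 2.9, 3.0],
    "economic_growth_rate": [5.0, 5.5, 4.5, 5.0],
    "inflation_rate": [1.0, 2.0, 3.0, 2.0],
}
preliminary_risk_factors = {name: label_indicator(series)
                            for name, series in history.items()}
print(preliminary_risk_factors)
```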

By acquiring the target city for data collection, the invention identifies the object from which data are collected in subsequent steps and provides basic data for the subsequent financial risk assessment. By formulating a data source strategy, the specific methods and sources of data collection can be determined, ensuring the reliability and validity of the collected data. The data source strategy data guide the management and control of the data collection process, which improves the efficiency and accuracy of collection and provides a reliable data basis for subsequent risk assessment. By collecting the economic index data of the target city, various indicators reflecting the economic condition of the city can be obtained. Economic index data are one of the important bases for evaluating financial risk; they can be used to analyze the economic activity, employment conditions, industrial structure, and other aspects of a city, providing a comprehensive market environment background for financial risk evaluation. By calculating the credibility of each data source in the economic index data set, the credibility of each source can be assessed and measured. This helps screen for data sources with higher credibility and improves the quality and reliability of the data. Removing the economic index data acquired from data sources whose credibility is below the preset threshold reduces the interference of unreliable data with the financial risk assessment and improves the accuracy and reliability of the result. By analyzing and labeling the trusted economic index data set, factors related to financial risk can be preliminarily determined. These preliminary risk factor data are one of the important bases for assessing the risk level of financial products.
Labeling the risk factors provides basic information for subsequent risk assessment, so that the risk level of the financial product can be further analyzed and assessed.

Preferably, step S24 performs reliability calculation on each data source in the economic indicator data set by using a data source reliability calculation formula, where the data source reliability calculation formula is as follows:

Wherein C is the credibility of the data source, n is the number of economic index data items acquired by the data source, i is the index of an economic index data item acquired by the data source, d_i is the deviation between the i-th economic index data item and the true value, π is the circle constant, σ_i is the standard deviation of the i-th economic index data item, t_i is the acquisition time of the i-th economic index data item, e is the base of the natural logarithm, and x is any positive real number.

The invention constructs a data source credibility calculation formula. One term of the formula represents the logarithmic deviation of the i-th economic index data item acquired by the data source, where d_i is the deviation of the data from the true value, σ_i is the standard deviation of the data, and π is the circle constant. This term measures the accuracy and stability of the data: the smaller the deviation and the standard deviation, the larger the logarithmic deviation and the more credible the data; conversely, the larger the deviation and the standard deviation, the smaller the logarithmic deviation and the less credible the data. The logarithmic function smooths the ratio of the deviation to the standard deviation and avoids excessively large or small extreme values. A second term represents an integral over the acquisition time of the i-th economic index data item, where t_i is the acquisition time of the data, e is the base of the natural logarithm, and x is any positive real number. This term measures the timeliness of the data: the shorter the acquisition time, the smaller the integral and the fresher the data; conversely, the longer the acquisition time, the larger the integral and the staler the data. The integral smooths the influence of the acquisition time and avoids excessively large or small extreme values. A third term represents an infinite limit, where x is any positive real number.
This term may act as a correction factor that makes the effect of the other terms in the formula more pronounced while contributing no influence of its own. The value of the term tends to 1, indicating that the credibility of the data source is not affected by the term but depends primarily on the deviation, standard deviation, and acquisition time of the data. The use of the limit makes the formula more complete and rigorous and avoids undefined or meaningless cases. The formula comprehensively reflects the quality and reliability of the economic index data acquired by the data source, so that the credibility of the data source can be evaluated, providing an effective basis and reference for data mining and risk level marking. It takes into account several influencing factors, such as data deviation, standard deviation, and acquisition time, so that the credibility of a data source is more objective and reasonable and is not distorted by abnormal or accidental individual data. It also uses common mathematical constants, such as the circle constant and the base of the natural logarithm, so that the credibility of the data source is more universal and standardized and is not affected by different units or systems.

Preferably, step S25 comprises the steps of:

Step S251, performing enterprise basic information acquisition on a target city according to data acquisition target city data so as to acquire an enterprise basic information data set, wherein the enterprise basic information data set comprises a plurality of enterprise basic information data;

Specifically, for example, basic information of an enterprise, such as the company name, registered capital, and legal representative, may be acquired by accessing a website or using the API of a business registration data service provider. Announcements and financial reports of securities exchanges are monitored to acquire financial data, revenue, profit, and other operation data of listed companies. By accessing an enterprise database or using the APIs of a business data provider, more extensive enterprise data are obtained, including the financial data and business conditions of non-listed companies. It should be noted that the enterprise basic information data in the enterprise basic information data set and the enterprise operation data in the enterprise operation data set should be in one-to-one correspondence, for example, matched through a common identifier (such as an enterprise registration number).
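
The one-to-one correspondence through a common identifier can be sketched as a simple keyed join. The field name `registration_number` and the sample records are hypothetical; the sketch also reports enterprises that have no matching operation record so that gaps in the correspondence are visible.

```python
def match_enterprise_records(basic_info, operation_data, key="registration_number"):
    """Pair basic-information and operation records that share an identifier.

    Returns (matched, unmatched_ids); `key` is a hypothetical field name.
    """
    operations_by_id = {record[key]: record for record in operation_data}
    matched, unmatched_ids = [], []
    for info in basic_info:
        operations = operations_by_id.get(info[key])
        if operations is None:
            unmatched_ids.append(info[key])
        else:
            matched.append({**info, **operations})
    return matched, unmatched_ids

# Illustrative records only.
basic = [
    {"registration_number": "A001", "name": "Example Bank"},
    {"registration_number": "A002", "name": "Example Broker"},
]
operations = [{"registration_number": "A001", "revenue": 120.0}]

matched, unmatched_ids = match_enterprise_records(basic, operations)
print(matched, unmatched_ids)
```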

Step S252, carrying out industry ecological division on the corresponding enterprises according to the enterprise basic information data set so as to acquire industrial ecological network data;

Specifically, for example, for each enterprise in the enterprise basic information data set, the industry to which it belongs may be extracted. The enterprises are classified by industry using an industry classification standard (such as the national economy industry classification standard). An industrial ecological network is then constructed based on the classification result, in which enterprises are nodes and the relationships among industries are edges. The structure and characteristics of the industrial ecological network, such as the degree centrality and betweenness centrality of nodes, can be analyzed using graph theory algorithms (such as graph traversal and shortest path algorithms). Betweenness centrality is an index that measures the importance of a node in the network, namely how strongly the node acts as an intermediary on the shortest paths connecting other nodes. It is calculated by counting, for each pair of other nodes in the network, the shortest paths between them and taking the ratio of the number of those shortest paths that pass through the node to the total number of shortest paths. A node with high betweenness centrality has greater control or intermediation over information flow, resource transfer, or impact propagation in the network. In the above embodiment, by calculating the betweenness centrality of each node (i.e., enterprise) in the industrial ecological network, the importance and degree of intermediation of each enterprise in the entire industrial ecology can be evaluated. 
Enterprises with high betweenness centrality may play a key role in industry ecology, and play an important role in information transmission, resource flow and the like. The index can help understand and analyze the structural characteristics of the industrial ecological network and identify enterprises having important influences on the whole network.
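As an illustration of the betweenness centrality calculation described above, the following is a minimal sketch using Brandes' algorithm on an unweighted, undirected graph. The choice of Brandes' algorithm, the adjacency-dictionary representation, and the four-enterprise chain are assumptions made for this example; the original text does not prescribe a particular algorithm.

```python
from collections import deque


def betweenness_centrality(graph):
    """Unnormalised betweenness centrality (Brandes' algorithm).

    graph: dict mapping node -> iterable of neighbours (must be symmetric).
    For each node, sums over all pairs of other nodes the fraction of
    shortest paths between the pair that pass through the node.
    """
    score = {v: 0.0 for v in graph}
    for source in graph:
        order = []                          # nodes in order of increasing distance
        preds = {v: [] for v in graph}      # predecessors on shortest paths
        sigma = {v: 0 for v in graph}       # number of shortest paths from source
        dist = {v: -1 for v in graph}
        sigma[source], dist[source] = 1, 0
        queue = deque([source])
        while queue:                        # breadth-first search
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in graph}     # dependency of source on each node
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != source:
                score[w] += delta[w]
    # Each unordered pair was visited from both endpoints, so halve the totals.
    return {v: s / 2.0 for v, s in score.items()}


# Hypothetical chain of four enterprises: A - B - C - D
chain = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
print(betweenness_centrality(chain))
```

In this chain the two middle enterprises lie on every shortest path that crosses the chain, so they receive the highest scores, while the end nodes receive zero.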

Step S253, constructing an enterprise ecological network for the industrial ecological network data so as to acquire enterprise ecological network architecture data;

Specifically, for example, each enterprise may be constructed as a node based on the industrial ecological network data. Connection edges are established between enterprises according to the relationships between them (such as cooperative relationships and competitive relationships). Social network analysis measures (e.g., node degree centrality, clustering coefficients, etc.) may be used to analyze the topology and key nodes of the enterprise ecological network.

Step S254, mining the implicit economic operation rule of the trusted economic index data set according to the enterprise operation data set and the enterprise ecological network architecture data, thereby obtaining economic operation rule data;

Specifically, for example, enterprises may be associated according to the collaboration relationships, supply chain relationships, and the like in the enterprise ecological network architecture data, so as to form an enterprise network. The association rules between the enterprise operation data and the economic index data are then analyzed. An association rule mining method can be adopted to find implicit rules and correlations between enterprise operation data and economic index data. For example, the Apriori algorithm may be used to mine frequent item sets in order to find associated item sets of operation data and economic index data. The strength and importance of an association rule may be determined using association rule evaluation indicators such as support, confidence, and lift. Cluster analysis may then be performed: enterprises are clustered according to the similarity of their operation data and economic index data to reveal economic operation modes and rules among different enterprise groups. The K-means clustering algorithm or a hierarchical clustering algorithm can be used to cluster enterprises into different economic operation modes, and the economic operation rule can be deduced by analyzing the characteristics and behaviors of the different enterprise groups based on the clustering result. Finally, the results are analyzed and interpreted: the interactions and dependencies among enterprises, and their relationships with the economic indexes, are analyzed according to the mining result to reveal the rules and trends of economic operation.
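The three evaluation indicators named above (support, confidence, and lift) can be computed for a single candidate rule as in the following minimal sketch. The item names and the four sample periods are hypothetical; a full Apriori implementation would additionally enumerate the frequent item sets.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Support, confidence and lift of the rule antecedent -> consequent.

    transactions: list of item collections, one per observation period.
    """
    antecedent, consequent = set(antecedent), set(consequent)
    n = len(transactions)
    count_a = sum(1 for t in transactions if antecedent <= set(t))
    count_c = sum(1 for t in transactions if consequent <= set(t))
    count_both = sum(1 for t in transactions if (antecedent | consequent) <= set(t))
    support = count_both / n                                   # P(A and C)
    confidence = count_both / count_a if count_a else 0.0      # P(C | A)
    lift = confidence / (count_c / n) if count_c else 0.0      # P(C | A) / P(C)
    return support, confidence, lift


# Hypothetical periods combining an economic index event and an enterprise event.
periods = [
    {"rate_up", "profit_down"},
    {"rate_up", "profit_down"},
    {"rate_down", "profit_up"},
    {"rate_down", "profit_down"},
]
print(rule_metrics(periods, {"rate_up"}, {"profit_down"}))
```

A lift above 1 suggests that the antecedent and consequent co-occur more often than they would if they were independent; it does not by itself establish causation.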

Step S255, marking the primary risk factors of the target financial products according to the economic operation rule data, so as to obtain the primary risk factor data.

Specifically, for example, economic indexes related to risk factors may be selected as features according to the characteristics of the target financial product and the purpose of the research. For example, GDP growth rate, inflation rate, and interest rate level are selected as representatives of risk factors. The historical performance of the target financial product is then marked based on the selected features. For example, the risk factor for each period is labeled as high risk, medium risk, or low risk based on changes in GDP growth rate, inflation rate, and interest rate level. This labeling may be performed automatically by setting thresholds or by using a machine learning algorithm. Data analysis and statistics are then carried out on the marked risk factor data. Statistical methods (e.g., mean, variance, correlation, etc.) may be used to analyze the relationships between different risk factors and to make an initial evaluation of the risk characteristics of the target financial product. Finally, the preliminary risk factors of the target financial product are interpreted according to the data analysis result. For example, if a high inflation rate is found to correlate strongly with the risk level of a fund, a preliminary conclusion may be drawn that the inflation rate is one of the important risk factors for that fund.
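Threshold-based labeling of a period can be sketched as follows. The threshold values and the rule of counting breached thresholds are assumptions chosen purely for illustration; the original text does not specify them.

```python
# Hypothetical thresholds (percent); not specified in the original text.
DEFAULT_THRESHOLDS = {
    "gdp_growth_min": 2.0,
    "inflation_max": 4.0,
    "interest_rate_max": 5.0,
}


def label_period(gdp_growth, inflation, interest_rate, thresholds=DEFAULT_THRESHOLDS):
    """Label one period as high, medium or low risk by counting breached thresholds."""
    breaches = 0
    if gdp_growth < thresholds["gdp_growth_min"]:
        breaches += 1
    if inflation > thresholds["inflation_max"]:
        breaches += 1
    if interest_rate > thresholds["interest_rate_max"]:
        breaches += 1
    if breaches >= 2:
        return "high risk"
    if breaches == 1:
        return "medium risk"
    return "low risk"


# Hypothetical history: (GDP growth %, inflation %, interest rate %) per period.
history = [(5.0, 2.0, 3.0), (5.0, 6.0, 3.0), (1.0, 6.0, 3.0)]
print([label_period(*period) for period in history])
```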

The invention can acquire the data set comprising the basic information data and the business operation data of the enterprises by carrying out basic information acquisition and business condition monitoring on the enterprises in the target cities. Such data is critical to understanding the enterprise structure, scale, nature, and business conditions of the target city. The business base information data may provide base information about business names, registered capital, legal representatives, etc., while the business operation data may provide data about financial status, sales, profitability, etc., of the business. Such data facilitates the comprehensive assessment of economic activity in the target city by interested parties such as financial institutions, investors, etc., thereby making corresponding investment decisions and risk management measures. By analyzing and processing the enterprise basic information data set, the corresponding enterprise can be subjected to industry ecological division, so that the industrial ecological network data are obtained. The data may reveal collaboration, supply chain, competition, etc. among different enterprises, thereby constructing a network structure for the target city industry ecology. The industrial ecological network data has important significance for understanding the industrial layout of the target city and the interaction among key enterprises, and the industry development trend. Such data facilitates financial institutions and decision makers in making industry, market forecasts, and risk assessment, promoting industry development and promoting economic growth. Through processing and analyzing the industrial ecological network data, an enterprise ecological network can be constructed, so that the enterprise ecological network architecture data can be obtained. 
The enterprise ecological network architecture data can reveal association relationships, cooperation modes, dependency relationships and the like among different enterprises. Such data is of great value for understanding the structure and operation of the enterprise ecosystem of the target city. For financial institutions and investors, the enterprise ecological network architecture data can provide a more comprehensive enterprise relationship view, and help them understand the stability, vulnerability and potential risk of the enterprise ecological system of the target city to make corresponding risk management and decision-making measures. By comprehensively analyzing the enterprise operation data set and the enterprise ecological network architecture data, the implicit economic operation rule can be mined, so that the economic operation rule data can be obtained. These data may reveal operational laws, trends, and key factors of the target city economy. By mining the economic operation rule data, financial institutions and investors can be helped to better understand the economic development condition of the target city, predict the economic trend and find potential risks and opportunities. Such data is of great importance for the formulation of investment strategies and risk management. The preliminary risk factor data can be obtained by marking the target financial product with the economic operation rule data. The preliminary risk factor data may reveal potential risk and uncertainty factors faced by the target financial product. Such data may provide a more comprehensive basis for risk assessment and decision making for financial institutions and investors. By knowing the preliminary risk factor data, the financial institution may take corresponding risk management measures and investors may make more informed investment decisions, thereby reducing risk and increasing return on investment.

Preferably, step S254 includes the steps of:

Step S2541, carrying out endangered enterprise analysis on the enterprise operation data set so as to acquire endangered enterprise data;

Specifically, for example, the enterprise operation data set may be analyzed based on the definition and metrics of an endangered enterprise, such as financial metrics (profit margin, debt repayment capability, etc.), market metrics (market share, sales growth rate, etc.), and competitive metrics (industry status, market prospects, etc.). Enterprises are classified as endangered or not endangered using suitable methods and algorithms, such as machine learning based classification algorithms (e.g., decision trees, support vector machines, etc.) or rule engines. The set of endangered enterprises is determined according to the classification result, and the endangered enterprise data is thereby acquired.
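The rule-engine option mentioned above can be sketched as a small set of named rules applied to each enterprise. The rule names, cut-off values, and the requirement that at least two rules fire are hypothetical choices for this example only.

```python
# Hypothetical rules; each takes an enterprise record and returns True when it fires.
RULES = {
    "negative_profit_margin": lambda e: e["profit_margin"] < 0.0,
    "high_debt_ratio": lambda e: e["debt_ratio"] > 0.8,
    "shrinking_sales": lambda e: e["sales_growth"] < -0.10,
}


def fired_rules(enterprise, rules=RULES):
    """Return the names of all rules that fire for one enterprise record."""
    return [name for name, rule in rules.items() if rule(enterprise)]


def endangered_enterprises(enterprises, min_fired=2):
    """Map enterprise id -> fired rules for enterprises with at least min_fired rules."""
    result = {}
    for enterprise_id, record in enterprises.items():
        fired = fired_rules(record)
        if len(fired) >= min_fired:
            result[enterprise_id] = fired
    return result


enterprises = {
    "E001": {"profit_margin": -0.05, "debt_ratio": 0.90, "sales_growth": 0.02},
    "E002": {"profit_margin": 0.12, "debt_ratio": 0.40, "sales_growth": 0.08},
}
print(endangered_enterprises(enterprises))
```

Keeping the fired rule names alongside each flagged enterprise makes the classification explainable, which a trained classifier would not provide without additional work.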

Step S2542, labeling the endangered enterprises corresponding to the enterprises in the enterprise ecological network architecture data according to the endangered enterprises data, so as to obtain the advanced enterprise ecological network architecture data;

Specifically, for example, the endangered enterprise data can be matched with the enterprise ecological network architecture data to find the position of each endangered enterprise in the ecological network. The enterprise ecological network architecture data describes the partnerships, supply chain relationships, and the like between enterprises. The endangered enterprises and their related enterprises can be labeled using professional software or algorithms, such as a graph network analysis algorithm or a community discovery algorithm. After the labeling is completed, the advanced enterprise ecological network architecture data is obtained; this data includes the labeling information of the endangered enterprises and can be used for subsequent analysis and mining.

Step S2543, carrying out enterprise operation mode analysis on the corresponding enterprise according to the enterprise operation data set so as to obtain the enterprise operation mode data set;

Specifically, for example, the operation mode (business model) of each enterprise may be analyzed using the enterprise operation data set. The operation mode may include aspects such as product portfolio, market positioning, sales channels, and operational policies. The enterprise operation data is analyzed and mined using appropriate methods and techniques, such as cluster analysis and association rule mining, to discover the differences and commonalities between the operation modes of different enterprises. Corresponding enterprise operation mode data is generated for each enterprise according to the analysis result; this data can take the form of classification labels, association rule sets, and the like, and is used for subsequent economic operation rule mining.

Step S2544, carrying out implicit economic operation rule mining on the trusted economic index data set according to the enterprise operation mode data set and the advanced enterprise ecological network architecture data, thereby obtaining economic operation rule data.

Specifically, for example, an integrated data set may be constructed by combining the enterprise operation mode data set and the advanced enterprise ecological network architecture data, where the integrated data set includes enterprise operation mode data, enterprise relationship data, and economic index data. The integrated data set is analyzed and mined using appropriate data mining techniques and algorithms, such as correlation analysis, time series analysis, and machine learning, to reveal economic operation rules. Economic operation rule data is extracted and summarized according to the analysis results; it can take the form of statistical indexes, trend patterns, association rules, and the like, and is used for understanding and predicting economic operation.

By performing endangered enterprise analysis on the enterprise operation data set, the method can identify enterprises in an endangered state and acquire the corresponding endangered enterprise data. Such data helps to understand the operational status and financial health of an endangered enterprise and the risks and challenges it may face. Such analysis may assist interested parties, such as financial institutions and investors, in taking appropriate measures, such as providing loan assistance, performing risk management, or taking rescue measures, to maintain financial stability and promote economic development. By combining the endangered enterprise data with the enterprise ecological network architecture data, the endangered enterprises in the enterprise ecological network can be marked, so that the advanced enterprise ecological network architecture data is obtained. Such data helps reveal associations, partnerships, and dependencies between enterprises. For stakeholders such as financial institutions and investors, this data provides a more comprehensive view of the enterprise ecological network, helping them better understand the impact and potential contagion effects of endangered enterprises on the overall economic system and take corresponding risk management and decision-making measures. The enterprise operation mode data set is obtained by conducting enterprise operation mode analysis on the enterprise operation data set. This data can reveal key elements such as the business strategy, market positioning, profit pattern, and business model of an enterprise. Knowledge of the operation mode of an enterprise helps financial institutions, investors, and decision makers assess the competitiveness, profitability, and sustainability of the enterprise and thus make more accurate investment decisions and risk assessments. 
By combining the enterprise operation mode data set and the advanced enterprise ecological network architecture data, the economic operation rule data can be obtained by carrying out implicit economic operation rule mining on the trusted economic index data set. Such data may help reveal potential laws, associations, and trends in the economic system. Knowledge of the laws of economic operation may provide further insight to financial institutions and decision makers, helping them to formulate economic predictions and manage risks, thereby promoting economic development and maintaining financial stability.

Preferably, step S2544 includes the steps of:

Step S25441, carrying out balance sheet analysis on each enterprise operation mode data in the enterprise operation mode data set so as to obtain an enterprise financial structure data set, wherein the enterprise financial structure data set comprises enterprise financial structure data corresponding to each enterprise;

Specifically, for example, balance sheet data for each enterprise may be extracted from the enterprise operation data set, including items such as assets, liabilities, and owner's equity. Based on the balance sheet data, a series of financial ratios, such as the current ratio, quick ratio, debt ratio, and capital structure ratio, are calculated. These ratios reflect the financial status and structure of the enterprise. The calculated financial ratio data is cleaned to remove abnormal values and missing values, ensuring the accuracy and integrity of the data. The financial ratio data is then normalized so that all ratios share the same scale and range, which prevents ratios with larger numerical ranges from having a disproportionate influence on the analysis results. The financial ratio data corresponding to each enterprise is organized into an enterprise financial structure data set, in which each enterprise has a set of financial ratio data describing its financial structure.
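The ratio calculation and normalization described above can be sketched as follows. The balance sheet field names and figures are hypothetical, and min-max scaling is one possible normalization chosen for this example; the original text does not name a specific method.

```python
def financial_ratios(bs):
    """Compute standard balance sheet ratios from one enterprise's balance sheet.

    bs is a dict with hypothetical keys: current_assets, inventory,
    current_liabilities, total_liabilities, total_assets.
    """
    return {
        "current_ratio": bs["current_assets"] / bs["current_liabilities"],
        "quick_ratio": (bs["current_assets"] - bs["inventory"]) / bs["current_liabilities"],
        "debt_ratio": bs["total_liabilities"] / bs["total_assets"],
    }


def min_max_normalise(rows):
    """Rescale every ratio to [0, 1] across enterprises (min-max scaling)."""
    keys = list(next(iter(rows.values())).keys())
    out = {enterprise_id: {} for enterprise_id in rows}
    for key in keys:
        values = [row[key] for row in rows.values()]
        lo, hi = min(values), max(values)
        for enterprise_id, row in rows.items():
            # A constant column carries no information; map it to 0.0.
            out[enterprise_id][key] = 0.0 if hi == lo else (row[key] - lo) / (hi - lo)
    return out


balance_sheets = {
    "E001": {"current_assets": 200, "inventory": 50, "current_liabilities": 100,
             "total_liabilities": 300, "total_assets": 600},
    "E002": {"current_assets": 150, "inventory": 30, "current_liabilities": 150,
             "total_liabilities": 400, "total_assets": 500},
}
ratios = {eid: financial_ratios(bs) for eid, bs in balance_sheets.items()}
print(ratios)
print(min_max_normalise(ratios))
```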

Step S25442, carrying out enterprise profitability analysis on the corresponding enterprise according to the enterprise financial structure dataset so as to obtain the enterprise profitability dataset;

Specifically, for example, a series of profitability metrics, such as gross profit margin, net profit margin, and return on assets, may be calculated based on the enterprise financial structure dataset. These metrics reflect the profitability and efficiency of the enterprise. The profitability metric data corresponding to each enterprise is organized into an enterprise profitability data set, in which each enterprise has a set of profitability metric data describing its profitability.
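The profitability metrics named above can be computed as in the following minimal sketch. It assumes that revenue, cost of goods sold, and net income figures are available alongside total assets; these input fields and the sample values are hypothetical.

```python
def profitability_metrics(revenue, cost_of_goods_sold, net_income, total_assets):
    """Gross profit margin, net profit margin and return on assets for one enterprise."""
    return {
        "gross_margin": (revenue - cost_of_goods_sold) / revenue,
        "net_margin": net_income / revenue,
        "return_on_assets": net_income / total_assets,
    }


# Hypothetical figures for one enterprise (same currency unit throughout).
metrics = profitability_metrics(
    revenue=1000.0, cost_of_goods_sold=600.0, net_income=100.0, total_assets=2000.0
)
print(metrics)
```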

Step S25443, performing enterprise intelligent risk assessment on the corresponding enterprise according to the enterprise operation mode dataset and the enterprise profitability dataset to obtain an enterprise intelligent risk feature dataset;

Specifically, for example, an intelligent risk assessment model may be built using machine learning algorithms (e.g., decision trees, random forests, etc.) based on the enterprise operation mode dataset and the enterprise profitability dataset. The established model is used to perform intelligent risk assessment on each enterprise to obtain a risk assessment result. The intelligent risk assessment results are organized into an enterprise intelligent risk feature data set, in which each enterprise has a risk assessment result describing its intelligent risk features. The enterprise ecological network architecture data comprises information such as the association relationships and connection modes among enterprises. Taking enterprises as nodes, a network graph is constructed according to the association relationships between them. A graph library (e.g., NetworkX) may be used to create and manipulate the network graph. The enterprise ecological network is analyzed using topological methods and algorithms to reveal the structure and characteristics of the network. Indexes such as the degree centrality, betweenness centrality, and closeness centrality of nodes can be calculated to evaluate the importance and influence of the nodes in the network. Community structure may be detected and tightly connected sub-networks identified, revealing associations and interactions between different enterprises. Key topological structure data, such as node centrality indexes and community division results, is extracted according to the topological analysis result. These data are consolidated into an enterprise ecological network topology data set, in which each enterprise has its location and characteristic information in the network. 
It should be noted that the specific method and algorithm of topology analysis may be selected according to actual requirements, for example, a graph theory-based algorithm (such as a shortest path algorithm or a clustering algorithm) or a complex network analysis method (such as small-world network or scale-free network analysis). The tools and techniques used for topology analysis, as well as the specific form and structure of the data set, should likewise be chosen and adjusted according to the specific circumstances.
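Two of the node indexes mentioned above, degree centrality and closeness centrality, can be sketched in plain Python as follows. The adjacency-dictionary representation and the star-shaped example network are assumptions for illustration; a graph library could be used instead.

```python
from collections import deque


def degree_centrality(graph):
    """Fraction of the other nodes that each node is directly connected to."""
    n = len(graph)
    if n <= 1:
        return {v: 0.0 for v in graph}
    return {v: len(set(neighbours)) / (n - 1) for v, neighbours in graph.items()}


def closeness_centrality(graph):
    """Number of reachable nodes divided by the sum of shortest-path distances to them."""
    result = {}
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:                      # breadth-first search from source
            v = queue.popleft()
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        total = sum(dist.values())
        reachable = len(dist) - 1
        result[source] = reachable / total if total else 0.0
    return result


# Hypothetical star network: one hub enterprise H linked to A, B and C.
star = {"H": ["A", "B", "C"], "A": ["H"], "B": ["H"], "C": ["H"]}
print(degree_centrality(star))
print(closeness_centrality(star))
```

The hub scores highest on both indexes, which matches the intuition that an enterprise connected to every other enterprise occupies the most central position.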

Step S25444, determining a target economic index for the trusted economic index dataset, so as to obtain target economic index data;

In particular, for example, the indicators related to the economic operation may be selected from a trusted economic indicator dataset, and the calculation methods and data sources of these indicators may be determined, ensuring their accuracy and reliability. And cleaning the selected economic index data, removing abnormal values and missing values, and ensuring the accuracy and the integrity of the data. The data of different indexes are organized into an economic index data set, each index corresponds to a column of data, and each row represents the data of a time point.

Step S25445, classifying and marking the credible economic index dataset according to the target economic index data, so as to construct an economic index library;

Specifically, for example, each data point in the trusted economic indicator dataset may be classified based on the target economic indicator data, for example as belonging to an economic growth period, an economic recession period, or a stationary period. Labeling may be performed using supervised learning algorithms (e.g., decision trees, support vector machines, etc.) or time series analysis methods (e.g., ARIMA models, exponential smoothing, etc.). The classified and marked trusted economic indicator data set is then organized into an economic index library, in which each data point comprises an economic indicator value and its corresponding classification mark.
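One way to combine exponential smoothing with period labeling is sketched below: the growth-rate series is smoothed first, and each smoothed value is then assigned a period label. The smoothing factor, the width of the stationary band, and the sample series are hypothetical values chosen for this example.

```python
def exponential_smoothing(series, alpha=0.5):
    """Simple exponential smoothing: s[0] = x[0], s[t] = alpha*x[t] + (1-alpha)*s[t-1]."""
    smoothed = [series[0]]
    for value in series[1:]:
        smoothed.append(alpha * value + (1 - alpha) * smoothed[-1])
    return smoothed


def label_phases(growth_rates, alpha=0.5, band=0.5):
    """Label each period from the smoothed growth rate.

    Values above +band are 'growth', values below -band are 'recession',
    and values inside the band are 'stationary'. alpha and band are assumptions.
    """
    labels = []
    for value in exponential_smoothing(growth_rates, alpha):
        if value > band:
            labels.append("growth")
        elif value < -band:
            labels.append("recession")
        else:
            labels.append("stationary")
    return labels


# Hypothetical growth rates (percent) for five consecutive periods.
growth = [2.0, 2.0, -2.0, -4.0, 4.0]
print(exponential_smoothing(growth))
print(label_phases(growth))
```

Smoothing before labeling damps single-period spikes, so an isolated outlier is less likely to flip the label of a period.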

Step S25446, extracting an economic operation implicit dependency relationship structure from the economic index library so as to obtain an economic operation implicit dependency structure table;

Specifically, for example, implicit dependencies between different economic indicators may be extracted from the economic index library using correlation analysis, causal relationship analysis, or the like. The analysis may be performed using statistical methods (e.g., correlation coefficients, grey relational analysis, etc.) or machine learning methods (e.g., causal graph models, Bayesian networks, etc.). The extracted implicit dependencies are organized into an economic operation implicit dependency structure table, which records the dependency relationships among different economic indicators and their strengths.
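The correlation-coefficient option can be sketched as follows: pairwise Pearson coefficients are computed and the pairs whose absolute coefficient exceeds a cut-off are recorded as rows of a dependency table. The indicator names, sample series, and the cut-off of 0.7 are hypothetical; correlation alone indicates association strength, not causal direction.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0          # a constant series has no defined correlation
    return cov / sqrt(var_x * var_y)


def dependency_table(indicators, min_abs_r=0.7):
    """Rows (indicator_a, indicator_b, r) for every pair with |r| >= min_abs_r."""
    rows = []
    for a, b in combinations(indicators, 2):
        r = pearson(indicators[a], indicators[b])
        if abs(r) >= min_abs_r:
            rows.append((a, b, round(r, 3)))
    return rows


# Hypothetical indicator series over four periods.
indicators = {
    "gdp_growth": [1.0, 2.0, 3.0, 4.0],
    "investment": [2.0, 4.0, 6.0, 8.0],
    "unemployment": [4.0, 3.0, 2.0, 1.0],
    "noise": [1.0, -1.0, -1.0, 1.0],
}
for row in dependency_table(indicators):
    print(row)
```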

Step S25447, carrying out economic operation rule mining on the trusted economic index dataset based on the enterprise ecological network topological structure data and the economic operation implicit dependency structure table so as to obtain economic operation rule data.

Specifically, for example, the economic operation rule data may be mined using a data mining technique (such as association rule mining or time series analysis) based on the enterprise ecological network topology data and the economic operation implicit dependency structure table. Association rules among different economic indexes can be found through an association rule mining method, or the trends and changes of future economic indexes can be predicted through a time series analysis method. The mined economic operation rules are organized into a data set or a report that shows the relationships and trends among different economic indexes and helps decision makers make corresponding economic policy adjustments and decisions.

The invention can acquire the enterprise financial structure data set by performing balance sheet analysis on the enterprise operation mode data. The enterprise financial structure data is an important indicator of an enterprise's financial status, including data on assets, liabilities, and owner's equity. Such data is of great importance for understanding the financial health, asset allocation, and liability structure of an enterprise. Financial institutions, investors, and the like can use this data to assess the financial risk, capital operation, and repayment capability of an enterprise in order to formulate corresponding risk management and decision-making measures. Through analysis of the enterprise financial structure data set, enterprise profitability analysis can be performed and the enterprise profitability data set obtained. The enterprise profitability data may reveal the profitability level, profit composition, and profitability indexes of the enterprise. Such data is of great importance in assessing the business condition, profitability, and competitiveness of an enterprise. Financial institutions and investors can use this data to assess the profitability, trends, and competitive status of an enterprise in order to make corresponding investment decisions. By comprehensively analyzing the enterprise operation mode data set and the enterprise profitability data set, enterprise intelligent risk assessment can be performed, so that the enterprise intelligent risk feature data set is obtained. The enterprise intelligent risk feature data may reveal the risk types, risk levels, and risk features to which the enterprise is exposed. Such data may provide financial institutions and investors with a more comprehensive basis for risk assessment and decision making. 
By analyzing the enterprise intelligent risk feature data set, topology structure data of the enterprise ecological network can be constructed; this data reveals the association relationships, cooperation modes, and dependency relationships among different enterprises. Such data is of great significance in understanding the stability, vulnerability, and potential risk of the enterprise ecosystem, and helps to formulate corresponding risk management and decision-making measures. By analyzing the trusted economic indicator data set, a target economic indicator can be determined, thereby obtaining target economic indicator data. The target economic indicator is an important index for measuring economic development and economic performance. Determining the target economic indicator helps stakeholders such as financial institutions and research institutions to clearly identify the important aspects and targets of economic development, and provides a basis for quantitative evaluation and comparison. This is of great significance for formulating economic policy, monitoring economic conditions, and evaluating economic development trends. By classifying and marking the trusted economic indicator data set according to the target economic indicator, an economic index library can be constructed. The economic index library is a database integrating various economic indicator data, organized and stored by category, dimension, and time. Constructing an economic index library facilitates economic analysis, policy formulation, and decision support. Stakeholders can acquire various economic indicator data through the economic index library and conduct data analysis, comparison, and research, so that the development condition, trends, and influencing factors of the economy are better understood and a scientific basis is provided for decision making and planning. 
By analyzing and processing the economic index library, the implicit dependency structure of the economic operation can be extracted, so that an economic operation implicit dependency structure table is obtained. The economic operation implicit dependency structure table reveals interrelationships, influence mechanisms and dependency degrees among different economic indicators. Such information is of great importance for understanding the complexity of the economic system, identifying key indicators and predicting economic trends. The extraction of the implicit dependency structure of the economic operation can help stakeholders to better understand the mechanism and rule of the economic operation, and provide scientific basis for risk management and decision support.

Preferably, step S3 comprises the steps of:

Step S31, enterprise management data acquisition is carried out on enterprises to which the target financial products belong, so that product enterprise management data are obtained;

Specifically, for example, trusted data sources, such as enterprise annual reports, financial reports, and public disclosure documents, may be determined for collecting enterprise management (governance) data. Data related to enterprise governance, such as board member information, equity structure, and internal control systems, is extracted from the selected data sources. The data may be collected and consolidated manually or obtained from public data platforms using automated tools and crawler technology. The collected enterprise management data is then subjected to data format conversion, naming unification, and other processing to facilitate subsequent analysis and use.

Step S32, performing asset association network analysis on enterprises to which the target financial products belong according to the product enterprise management data so as to acquire asset association network data;

Specifically, for example, the asset data of the enterprise to which the target financial product belongs, such as subsidiaries, equity holdings, and investment projects, may be collected using the information provided in the product enterprise management data. Corresponding asset association information can be obtained through channels such as enterprise annual reports, financial reports, and business data. An asset association network of the enterprise to which the target financial product belongs is then constructed based on the collected asset association data. The network may be represented by nodes and edges, where nodes represent enterprises or assets and edges represent the associations between them. The constructed asset association network is analyzed to explore the asset relationships and connection modes within the enterprise. Graph analysis methods (e.g., social network analysis, connectivity analysis, etc.) may be used to reveal the degree of association between assets, the network characteristics of key nodes, and the like.
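The connectivity analysis mentioned above can be sketched as follows: an undirected asset association network is built from a list of association edges, and its connected components are extracted by breadth-first search. The entity names and edges are hypothetical.

```python
from collections import defaultdict, deque


def build_graph(edges):
    """Build an undirected adjacency map from (entity_a, entity_b) association pairs."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def connected_components(graph):
    """Return the groups of entities that are linked directly or indirectly."""
    seen = set()
    components = []
    for start in list(graph):
        if start in seen:
            continue
        component = set()
        queue = deque([start])
        seen.add(start)
        while queue:
            node = queue.popleft()
            component.add(node)
            for neighbour in graph[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        components.append(component)
    return components


# Hypothetical associations between an enterprise, its subsidiaries and projects.
edges = [
    ("ParentCo", "SubsidiaryA"),
    ("ParentCo", "SubsidiaryB"),
    ("SubsidiaryA", "ProjectX"),
    ("FundY", "ProjectZ"),
]
for component in connected_components(build_graph(edges)):
    print(sorted(component))
```

Entities in the same component can be reached from one another through chains of associations, so a shock to one of them may propagate to the others; entities in different components are not linked by any recorded association.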

Step S33, performing core technical talent team composition analysis on the enterprise to which the target financial product belongs, so as to acquire talent composition data;

Specifically, for example, talent information of the enterprise to which the target financial product belongs may be collected through the enterprise's human resources department or a related internal database. This includes key information such as the job titles, academic qualifications, and work experience of core technical staff. The collected talent data is used to analyze the composition of the core technical talent team. The number of personnel with each title, academic qualification, or level of work experience can be counted to understand the talent structure and distribution of the enterprise. Associations among core technical talents, such as collaboration between personnel and sharing of experience, are also explored. Social network analysis methods (e.g., co-occurrence analysis, relationship strength analysis, etc.) may be used to reveal the degree of association between talents and to identify key people.
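As a non-limiting illustration, the composition counting and the co-occurrence analysis described above may be sketched as follows. The record fields, the sample staff, and the functions composition and co_occurrence are assumptions made for this sketch only.

```python
from collections import Counter
from itertools import combinations

def composition(staff, field):
    """Share of core technical staff in each category of `field`."""
    counts = Counter(person[field] for person in staff)
    total = sum(counts.values())
    return {category: count / total for category, count in counts.items()}

def co_occurrence(teams):
    """Count how often each pair of people appears on the same team."""
    pairs = Counter()
    for members in teams:
        for a, b in combinations(sorted(set(members)), 2):
            pairs[(a, b)] += 1
    return pairs

# Hypothetical records, for illustration only.
staff = [
    {"name": "A", "title": "senior engineer", "degree": "PhD"},
    {"name": "B", "title": "engineer", "degree": "MSc"},
    {"name": "C", "title": "senior engineer", "degree": "MSc"},
    {"name": "D", "title": "engineer", "degree": "MSc"},
]
teams = [["A", "B"], ["A", "B", "C"], ["C", "D"]]

degree_shares = composition(staff, "degree")
strongest_pair = co_occurrence(teams).most_common(1)[0]
```

Here the pair count serves as a simple relationship strength; teams could be project groups or patent co-inventor lists.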

Step S34, carrying out patent data deep mining on enterprises to which target financial products belong, thereby obtaining patent technology data;

Specifically, for example, patent information of the enterprise to which the target financial product belongs may be searched and acquired using a patent database (e.g., a patent search system, a patent database platform, etc.). A patent data set related to the enterprise can be obtained by searching on conditions such as the enterprise name, keywords, and technical fields. The acquired patent data is cleaned and organized to remove duplicate, invalid, or irrelevant patent information. Technical means such as patent classification and keyword extraction can be used to screen and classify the patents, thereby ensuring the accuracy and usability of the data. Technical analysis is then carried out on the cleaned patent data to explore the technical fields and key innovations of the enterprise's patents. Key information in the patent text, such as technical keywords and patent citation relations, can be extracted using methods such as text mining and natural language processing.

Step S35, carrying out technical innovation capability assessment on enterprises to which the target financial products belong according to talent composition data and patent technology data, thereby obtaining technical innovation capability assessment data;

Specifically, for example, a series of indicators reflecting technical innovation capability may be defined based on the talent composition data and the patent technology data. The indicators may include patent quantity, patent quality, the number and structure of technical staff, technical partnerships, etc. The talent composition data and the patent technology data are integrated to establish a comprehensive technical innovation capability assessment data set. Data fusion and data processing techniques can be used to weight or normalize indicators from different data sources on a uniform basis. Based on the integrated data set, an evaluation model or algorithm is applied to evaluate the technical innovation capability of the enterprise to which the target financial product belongs. A multidimensional evaluation method, such as an indicator weighting method or the analytic hierarchy process, can be adopted to take the importance and weight of each indicator into account.
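As a non-limiting illustration, the normalization and indicator weighting described above may be sketched as follows. Min-max normalization is one possible choice and is an assumption here, as are the indicator names, weights, and sample values.

```python
def min_max_normalize(values):
    """Scale values to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def weighted_scores(indicators, weights):
    """indicators: {name: [value per enterprise]}; weights: {name: weight}."""
    total_weight = sum(weights.values())
    n = len(next(iter(indicators.values())))
    scores = [0.0] * n
    for name, values in indicators.items():
        share = weights[name] / total_weight
        for i, v in enumerate(min_max_normalize(values)):
            scores[i] += share * v
    return scores

# Hypothetical indicators for three enterprises, for illustration only.
indicators = {"patent_count": [10, 30, 20], "core_staff": [5, 5, 15]}
weights = {"patent_count": 0.6, "core_staff": 0.4}
innovation_scores = weighted_scores(indicators, weights)
```

Normalizing before weighting keeps indicators measured in different units from dominating the score.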

Step S36, evaluating the risk resistance of the enterprise to which the target financial product belongs according to the asset association network data and the technical innovation capability assessment data, so as to acquire enterprise comprehensive risk resistance data.

Specifically, for example, a series of indicators reflecting the risk resistance of an enterprise may be defined based on the asset association network data and the technical innovation capability assessment data. The indicators may include asset diversity, the relationship between technical innovation capability and risk resistance, key talent stability, etc. The asset association network data and the technical innovation capability assessment data are integrated to establish a comprehensive risk resistance assessment data set. Data fusion and data processing techniques can be used to weight or normalize indicators from different data sources on a uniform basis. Based on the integrated data set, an evaluation model or algorithm is applied to evaluate the risk resistance of the enterprise to which the target financial product belongs. A comprehensive evaluation method, such as an indicator weighting method or the analytic hierarchy process, can be adopted to take the importance and weight of each indicator into account.
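As a non-limiting illustration, indicator weights may be derived with the analytic hierarchy process as sketched below. The sketch uses the row geometric-mean approximation of the priority vector; the pairwise comparison values and the indicator values are hypothetical.

```python
import math

def ahp_weights(pairwise):
    """Approximate AHP priority weights by the row geometric-mean method."""
    n = len(pairwise)
    geometric_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geometric_means)
    return [g / total for g in geometric_means]

def risk_resistance_score(indicator_values, weights):
    """Weighted sum of indicator values already scaled to [0, 1]."""
    return sum(w * v for w, v in zip(weights, indicator_values))

# Hypothetical pairwise judgments for three indicators, in order:
# asset diversity, technical innovation capability, key talent stability.
# pairwise[i][j] states how much more important indicator i is than j.
pairwise = [
    [1.0, 2.0, 4.0],
    [0.5, 1.0, 2.0],
    [0.25, 0.5, 1.0],
]
weights = ahp_weights(pairwise)
score = risk_resistance_score([0.8, 0.6, 0.4], weights)
```

In practice a consistency check on the pairwise matrix would normally accompany this step; it is omitted here for brevity.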

The invention can know the conditions of the management structure, decision mechanism, internal control and the like of the enterprise to which the target financial product belongs by collecting the product enterprise management data, and provides an important basis for evaluating the stability and the sustainability of the enterprise. This helps investors, regulatory authorities and other stakeholders to determine the corporate governance risk and potential problems and make corresponding decisions. Asset association network analysis may reveal the asset structure, associations, and risk conduction paths of an enterprise. By analyzing the asset association network of the enterprise to which the target financial product belongs, the core asset, the dependency relationship and the risk exposure of the enterprise can be identified. This helps to understand the asset quality, liquidity risk, and systematic risk of the enterprise, providing a reference for investors and regulatory authorities to assess the risk bearing capacity of the enterprise. The core technology talents are important driving factors for enterprise innovation and competitiveness. By analyzing the core technical talent constitution of the enterprise to which the target financial product belongs, the technical strength, innovation capability and talent reserve condition of the enterprise can be known. This helps to assess the technical competitiveness, long-term development potential and innovation ability of an enterprise, providing an important reference for investment decision making and strategic planning. The patent is an important embodiment of the innovation result of enterprises, and has important significance on the technical strength and competitive advantage of the enterprises. The technical field, technical layout and innovation activity condition of the enterprise can be known by deeply mining the patent data of the enterprise to which the target financial product belongs. 
This helps to assess the technical strength, technical barriers, and technical innovation capabilities of the enterprise, providing investors and decision makers with important information about the technical value and development potential of the enterprise. By integrating talent composition data and patent technology data, the technical innovation capability of an enterprise to which a target financial product belongs is evaluated, and innovation strength, technical research and development investment and innovation achievements of the enterprise can be known. The method is helpful for evaluating the leading position of enterprises in the technical field, the sustainability of innovation capability and the future growth potential, and provides key judgment indexes for investors and decision makers. The comprehensive risk resistance of the enterprise to which the target financial product belongs can be evaluated by integrating the asset association network data and the technical innovation capability evaluation data. This helps to understand the strength of the risk tolerance, systemic risk exposure, and anti-risk capability of the enterprise. The assessment results may provide important references to investors, regulatory authorities, and other stakeholders to assist them in making risk management and decisions.

Preferably, step S4 comprises the steps of:

Step S41, enterprise operation data acquisition is carried out on a target financial product, so that an enterprise operation data set is obtained;

Specifically, for example, the data sources from which enterprise operation data is to be collected, such as financial reports, market data, and industry reports, may be determined. The data may be obtained from public data sources, financial institution data, internal databases, etc. The enterprise operation data is captured in a suitable manner, such as through an API, a crawler program, or manual collection, while ensuring the accuracy and integrity of the data.

Step S42, carrying out structuring treatment on the enterprise operation data set so as to obtain a structured enterprise operation data set;

Specifically, for example, unstructured or semi-structured enterprise operation data may be converted to structured data, for example by applying text mining and natural language processing to text data, and image recognition and feature extraction to picture data. The processed data is arranged in a defined format for subsequent analysis and modeling. For example, the data is organized into tabular form, with each row representing one sample and each column representing one feature.

Step S43, inputting the structured enterprise operation data set into a preset convolutional self-encoder deep learning model, and automatically extracting high-dimensional features of the structured enterprise operation data set through an unsupervised feature learning algorithm so as to obtain product management risk feature data;

Specifically, for example, an appropriate convolutional self-encoder (convolutional autoencoder) deep learning model may be selected for feature learning and feature extraction. An existing model architecture, such as a convolutional autoencoder, may be selected. The structured enterprise operation data set is taken as input data, ensuring that the data format matches the model's requirements. The convolutional self-encoder deep learning model is trained on the structured enterprise operation data set, learning data features in an unsupervised manner and extracting high-dimensional feature representations. The trained model is then used to perform feature extraction on the structured enterprise operation data set to obtain product management risk feature data. These features may be the output of a hidden layer in the model or the output of the encoder.

Step S44, constructing a GBDT (gradient boosting decision tree) model, and inputting the product management risk feature vectors as input sample features into the GBDT model for training so as to obtain a product management risk rating model;

Specifically, for example, a training data set including product management risk feature vectors and the corresponding product management risk rating labels may be prepared, ensuring the accuracy of the data and the consistency of the labels. Parameters of the GBDT model, such as the number of trees, the depth of the trees, and the learning rate, are set. These parameters may be tuned for the particular problem and data set. The GBDT model is trained on the training data set, taking the product management risk feature vectors as input sample features and the product management risk rating labels as output labels. During training, the model fits the training data iteratively, with each iteration improving the prediction. The trained GBDT model is evaluated on a validation data set, and evaluation metrics (such as accuracy, recall, and F1 score) are calculated to measure its performance. The trained GBDT model is saved for later use in prediction.
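As a non-limiting illustration of the iterative fitting described above, a minimal gradient boosting sketch is given below. It is a simplification under stated assumptions: the trees are depth-1 regression stumps, the loss is squared error (so each tree is fitted to the current residuals), and the risk label is treated as an ordinal number. The function names and sample data are hypothetical; a production system would more likely use an established GBDT library.

```python
def fit_stump(X, residuals):
    """Return (feature, threshold, left_value, right_value) with the
    lowest squared error on the residuals, or None if no split exists."""
    best = None
    for j in range(len(X[0])):
        for threshold in sorted({row[j] for row in X}):
            left = [r for row, r in zip(X, residuals) if row[j] <= threshold]
            right = [r for row, r in zip(X, residuals) if row[j] > threshold]
            if not left or not right:
                continue
            left_mean = sum(left) / len(left)
            right_mean = sum(right) / len(right)
            error = (sum((r - left_mean) ** 2 for r in left)
                     + sum((r - right_mean) ** 2 for r in right))
            if best is None or error < best[0]:
                best = (error, j, threshold, left_mean, right_mean)
    return None if best is None else best[1:]

def stump_output(stump, row):
    j, threshold, left_value, right_value = stump
    return left_value if row[j] <= threshold else right_value

def fit_gbdt(X, y, n_trees=100, learning_rate=0.3):
    """Start from the mean label, then add stumps fitted to residuals."""
    base = sum(y) / len(y)
    predictions = [base] * len(y)
    stumps = []
    for _ in range(n_trees):
        residuals = [t - p for t, p in zip(y, predictions)]
        stump = fit_stump(X, residuals)
        if stump is None:
            break
        stumps.append(stump)
        predictions = [p + learning_rate * stump_output(stump, row)
                       for row, p in zip(X, predictions)]
    return base, learning_rate, stumps

def predict(model, row):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(stump_output(s, row) for s in stumps)

# Hypothetical feature vectors and ordinal risk labels (0 = low, 1 = high).
X = [[0.1, 0.7], [0.2, 0.1], [0.8, 0.6], [0.9, 0.2]]
y = [0, 0, 1, 1]
model = fit_gbdt(X, y)
```

The number of trees and the learning rate correspond to the tunable parameters named in the text; tree depth is fixed at one in this sketch.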

Step S45, inputting the product management risk feature vector into the product management risk rating model to perform feature mapping and management risk prediction, so as to acquire product potential risk data.

Specifically, for example, the product management risk feature vector data to be predicted may be prepared. The feature vector to be predicted is input into the trained GBDT model, and feature mapping is performed through the model to obtain the corresponding potential risk features. The mapped features are input into the rating model, which predicts the management risk and outputs a risk rating result for the input features. The product potential risk data is obtained from the management risk prediction result, and may be a specific risk rating value or a risk rating class.

The invention can acquire the enterprise operation data set by collecting the operation data of the enterprise associated with the target financial product. Enterprise operational data is a variety of data generated by an enterprise during business activities, including financial data, sales data, production data, and the like. Such data is of great importance in assessing business conditions, performance and risk levels of an enterprise. By collecting enterprise operation data, a comprehensive and accurate data base can be established, and support is provided for subsequent data processing and analysis. By structuring the enterprise operation data set, the original data can be converted into a structured data form, so that the data has better organization, readability and analyzability. Structured enterprise operating datasets can provide clearer and more easily understood data representations for facilitating subsequent data analysis and modeling. The structured data set can also be used for data cleaning, noise removal, abnormal value removal and other treatments, so that the quality and accuracy of the data are improved. By using a convolutional self-encoder deep learning model, the automatic extraction of high-dimensional features from a structured enterprise operating dataset can be performed. Such an unsupervised feature learning algorithm can learn and discover potentially important features from the data without manually defining feature engineering. By extracting the product management risk characteristic data, potential risk factors in enterprise operation can be revealed, and a foundation is provided for subsequent risk rating and risk prediction. By constructing GBDT gradient lifting decision tree models, the business risk of the enterprise can be estimated and predicted by using the product business risk feature vectors. 
The GBDT model is a powerful machine learning algorithm that can effectively capture nonlinear relationships and interactions between features. By training the product management risk rating model, enterprises can be classified and rated according to the product management risk feature vectors, and financial institutions, investors and decision makers are helped to better understand and evaluate the management risks of the enterprises. Feature mapping and business risk prediction can be performed by inputting the product business risk feature vector into the product business risk rating model. Thus, potential risk data based on the product management risk characteristics, namely a prediction result of enterprise management risk, can be obtained. Such data may provide information about the potential risk level of the product and the challenges that may be faced. By predicting and assessing the operational risk of the product, financial institutions, investors and decision makers can better understand the risk status of the product, thereby taking appropriate measures to manage risk, make policies and make decisions.

Preferably, step S5 comprises the steps of:

Step S51, risk sensitivity analysis is carried out on the potential risk data of the product, so that network key node factor data are obtained;

Specifically, for example, the sensitivity of different risk factors may be analyzed using statistical analysis methods or machine learning algorithms based on the product potential risk data. The contribution of each risk factor to the product risk can be measured using methods such as correlation coefficients, analysis of variance, and regression analysis. According to the result of the risk sensitivity analysis, the risk factors with the greatest influence are determined and taken as network key node factors. Ranking or thresholding methods may be used to select the network key node factors.
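As a non-limiting illustration, the correlation-based sensitivity measure with ranking and thresholding may be sketched as follows. The factor names, sample series, and the 0.7 threshold are assumptions made for this sketch only.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient; 0.0 if either series is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        return 0.0
    return cov / (spread_x * spread_y)

def key_node_factors(factors, risk, threshold=0.7):
    """Rank factors by |correlation| with risk and keep those at or
    above the threshold."""
    scores = {name: abs(pearson(values, risk))
              for name, values in factors.items()}
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [name for name, score in ranked if score >= threshold]

# Hypothetical observations, for illustration only.
factors = {
    "leverage": [1.0, 2.0, 3.0, 4.0],
    "noise": [1.0, -1.0, 1.0, -1.0],
}
product_risk = [2.0, 4.0, 6.0, 8.0]
key_factors = key_node_factors(factors, product_risk)
```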

Step S52, carrying out comprehensive risk factor network construction on the preliminary risk factor data and the network key node factor data so as to obtain product risk factor network architecture data;

Specifically, for example, the preliminary risk factor data and the network key node factor data can be integrated to construct a product risk factor network. The network may be constructed using graph theory algorithms and network analysis tools (e.g., Gephi), with nodes representing risk factors and edges representing the relationships between factors. The architecture data of the product risk factor network, including node attributes, edge weights, and the like, is acquired from the result of the comprehensive risk factor network construction.

Step S53, acquiring a market fluctuation data set;

Specifically, for example, market-related data may be obtained from data sources such as financial data suppliers, exchanges, and the like. Stock index data, futures price data, foreign exchange data, etc. are collected, which reflect market fluctuations. The real-time or historical market volatility data may be obtained using an API interface or a data subscription service.

Step S54, carrying out risk propagation simulation on the product risk factor network architecture data according to the market fluctuation data set so as to acquire risk propagation path data;

Specifically, for example, an appropriate risk propagation simulation method, such as Monte Carlo simulation or a system dynamics model, may be selected. Based on the market fluctuation data set, the propagation relationships between different risk factors are simulated. The market fluctuation data is taken as input, and risk propagation simulation is performed on the product risk factor network architecture data. Techniques such as random sampling can be used to simulate the risk propagation process, taking into account the influence of market volatility on different risk factors. During the simulation, risk propagation path data, including the state changes of each node and the time series of the risk propagation path, is recorded and acquired.
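As a non-limiting illustration, a Monte Carlo risk propagation simulation over a weighted directed risk factor network may be sketched as follows. The propagation rule is an assumption of this sketch: each edge carries a base transmission probability, and each run samples one market volatility factor that scales those probabilities. The node names, probabilities, and factors are hypothetical.

```python
import random

def simulate_risk_propagation(edges, shocked, volatility_factors,
                              n_runs=1000, seed=0):
    """edges: {source: {target: base_transmission_probability}}.
    Returns (stress frequency per node, recorded path per run)."""
    rng = random.Random(seed)
    hit_counts = {}
    paths = []
    for _ in range(n_runs):
        factor = rng.choice(volatility_factors)  # market condition this run
        stressed = set(shocked)
        frontier = list(shocked)
        path = list(shocked)  # order in which nodes became stressed
        while frontier:
            node = frontier.pop(0)
            for target, base_probability in edges.get(node, {}).items():
                probability = min(1.0, base_probability * factor)
                if target not in stressed and rng.random() < probability:
                    stressed.add(target)
                    frontier.append(target)
                    path.append(target)
        paths.append(path)
        for node in stressed:
            hit_counts[node] = hit_counts.get(node, 0) + 1
    frequency = {node: count / n_runs for node, count in hit_counts.items()}
    return frequency, paths

# Hypothetical network and volatility factors, for illustration only.
risk_edges = {
    "interest_rate": {"credit": 0.6, "fx": 0.2},
    "credit": {"liquidity": 0.5},
}
frequency, paths = simulate_risk_propagation(
    risk_edges, ["interest_rate"], [0.8, 1.0, 1.5], n_runs=500)
```

The recorded paths play the role of the risk propagation path data, and the frequencies summarize how often each node's state changed across runs.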

Step S55, risk propagation influence analysis is carried out on the risk propagation path data, so that risk propagation influence data are obtained;

Specifically, for example, an appropriate risk propagation impact analysis method, such as node centrality analysis or network connectivity analysis, may be selected. These methods can help determine which nodes in the risk propagation path have a greater impact on the risk propagation of the overall system. Based on the risk propagation path data, the propagation influence of each node is calculated and evaluated. Analysis may be performed using graph theory algorithms, network analysis tools, etc., to determine key nodes and propagation paths.
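As a non-limiting illustration, two simple influence measures may be computed per node as sketched below: out-strength (a weighted degree centrality) and reach (the number of downstream nodes, a connectivity measure). The sketch assumes that the risk propagation path data has already been aggregated into weighted directed edges; the edge values are hypothetical.

```python
def reachable(edges, start):
    """Nodes that can be reached from `start` along directed edges."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for target in edges.get(node, {}):
            if target not in seen and target != start:
                seen.add(target)
                stack.append(target)
    return seen

def propagation_influence(edges):
    """Per node: summed outgoing edge weight and downstream node count."""
    nodes = set(edges) | {t for targets in edges.values() for t in targets}
    return {
        node: {
            "out_strength": sum(edges.get(node, {}).values()),
            "reach": len(reachable(edges, node)),
        }
        for node in nodes
    }

# Hypothetical aggregated propagation edges, for illustration only.
propagation_edges = {
    "interest_rate": {"credit": 0.6, "fx": 0.2},
    "credit": {"liquidity": 0.5},
}
influence = propagation_influence(propagation_edges)
key_node = max(influence, key=lambda n: influence[n]["reach"])
```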

Step S56, final risk grade marking is carried out on the target financial product by utilizing the risk transmission influence data, so that final risk grade data is obtained;

Specifically, for example, an appropriate final risk level labeling method may be selected according to business needs and risk assessment criteria. The final risk level may be determined from the risk propagation influence data and preset evaluation rules. The target financial product is then labeled with its final risk level based on the risk propagation influence data and the evaluation rules. Methods such as classification algorithms and decision trees can be used to assign the product to one of several risk levels.
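As a non-limiting illustration, a preset evaluation rule that maps a risk propagation influence score to a risk level may be sketched as follows. The cutoffs and the level names R1 to R5 are hypothetical and are not specified by the text.

```python
import bisect

def risk_level(score, cutoffs=(0.2, 0.4, 0.6, 0.8),
               levels=("R1", "R2", "R3", "R4", "R5")):
    """Map a score in [0, 1] to a level; a score equal to a cutoff
    falls into the higher level."""
    return levels[bisect.bisect_right(cutoffs, score)]

# Hypothetical influence scores per product, for illustration only.
product_scores = {"product_a": 0.15, "product_b": 0.55, "product_c": 0.92}
final_levels = {name: risk_level(s) for name, s in product_scores.items()}
```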

Step S57, performing multi-dimensional visualization processing on the final risk level data so as to obtain a product risk level chart.

Specifically, for example, an appropriate data visualization tool, such as Python's Matplotlib, Tableau, or Power BI, may be selected. Based on the final risk level data, a product risk level chart is generated using the selected tool. Bar charts, pie charts, radar charts, and other forms can be drawn to display products of different risk levels visually. Labels, color coding, and the like can be added to improve the readability of the chart and how effectively it conveys information.

According to the risk sensitivity analysis method, the sensitivity factors of the potential risks of the product, namely factors having important influence on the risks of the product, can be identified through the risk sensitivity analysis. The key node factor data can help to know main sources and key factors of product risks, and provide important basis for subsequent risk assessment and control. Through the comprehensive risk factor network construction, preliminary risk factor data and key node factor data can be combined to construct a comprehensive network architecture of the product risk factors. These data can help to understand the interrelationship and propagation paths between different risk factors to more fully assess the risk level and potential risk of the product. The comprehensive risk factor network architecture data provides an important basis for subsequent risk propagation simulation and risk level annotation. By acquiring the market volatility dataset, the impact of dynamic changes in the market on financial product risk can be understood. The data can help to know the change trend of the market risk and the path of risk propagation, and provide important basis for subsequent risk propagation simulation and influence analysis. Through risk propagation simulation, the propagation path of the product risk in the market can be simulated and predicted. The data can help to know the spreading effect and potential influence of the product risk in the market, and provide important basis for subsequent risk spreading influence analysis and risk grade marking. Through risk propagation influence analysis, the influence degree of different propagation paths on the product risk can be evaluated. The data can help to know the transmission effect and potential influence of different transmission paths on the product risk, and provide important basis for subsequent risk level labeling. 
The risk transmission influence data can quantify influence of product risk transmission, and a more accurate and comprehensive evaluation basis is provided for risk grade labeling. By comprehensively considering the influence of risk spread, the risk degree of the financial product can be estimated more accurately. The risk propagation influence data provides quantitative information of the risk propagation path and the propagation effect, and is helpful for determining the accuracy and reliability of the risk level. By marking the risk level, the influence degree of different risk factors on the financial products can be quantified. This helps the decision maker to better understand the importance and potential impact of different risk factors, providing basis for making corresponding risk management policies and decisions. By marking the risk grades of different financial products, the risk degrees of different products can be compared transversely. This helps the decision maker to understand the risk characteristics and risk exposure of different products for effective risk management and selection of the appropriate portfolio. The risk level labeling result can provide important reference information for decision makers and help the decision makers to formulate risk control strategies and management measures. Based on accurate risk level assessment, a decision maker can better understand the risk characteristics and potential risks of the product, thereby making informed decisions and taking appropriate risk management measures. The final risk level data is subjected to multi-dimensional visualization processing, so that a risk level chart of the product can be intuitively presented. These charts can help the decision maker to better understand and analyze the risk situation of the product to make corresponding decisions and measures. 
The multidimensional visualization processing can provide more visual and easily understood product risk level information, and the accuracy and efficiency of decision making are improved.

Preferably, step S6 comprises the steps of:

Step S61, carrying out target financial product account related record collection on an investor so as to acquire account record data of the investor;

Specifically, for example, the source of the investor account record data, such as a financial institution (a bank, a securities broker, a fund company, etc.), may be determined. It is determined whether a data interface needs to be established with the relevant financial institution or a data license needs to be obtained. The investor account record data is obtained through a data interface with the financial institution or by other lawful means. Data related to the target financial product is collected, including account transaction records, fund flow records, position holding information, and the like.

Step S62, performing high-frequency transaction detection on the investor account record data so as to acquire investor high-frequency transaction data;

Specifically, for example, the definition of a high-frequency transaction, such as the number of transactions per unit time or the transaction frequency, may be determined according to business requirements and regulatory requirements. An appropriate high-frequency transaction detection method is selected, such as statistical analysis based on transaction frequency or an algorithmic model. Techniques such as data mining and machine learning may be used. High-frequency transaction detection is carried out on the investor account record data, and the transaction records that meet the high-frequency transaction definition are screened out. Time windows, statistical indicators, and the like can be used to detect and identify high-frequency transaction behavior.
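As a non-limiting illustration, time-window detection of high-frequency transactions may be sketched as follows. The window length, the trade-count limit, and the sample timestamps (in seconds) are assumptions made for this sketch only.

```python
from collections import deque

def detect_high_frequency(timestamps, window_seconds=60, max_trades=5):
    """Return the timestamps of trades that bring the number of trades
    in the trailing window above max_trades."""
    window, flagged = deque(), []
    for ts in sorted(timestamps):
        window.append(ts)
        # Drop trades that have fallen out of the trailing window.
        while ts - window[0] >= window_seconds:
            window.popleft()
        if len(window) > max_trades:
            flagged.append(ts)
    return flagged

# Hypothetical trade times in seconds, for illustration only.
trade_times = [0, 5, 10, 15, 20, 25, 200, 400]
high_frequency_trades = detect_high_frequency(trade_times)
```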

Step S63, carrying out asset data statistics and summarization on investors so as to obtain the asset structure data of the investors;

Specifically, for example, the investors' assets may be counted and aggregated according to the fund flow records and the position holding information in the investor account record data. The market value or share of funds held in each type of asset (e.g., stocks, bonds, funds, etc.) is counted. Based on the asset structure data, the personal asset distribution of the investor is analyzed. The proportion of each type of asset in total assets can be calculated, and a pie chart or bar chart can be drawn to show the personal asset distribution. The personal asset distribution data is acquired from the result of this analysis. Personal assets can be classified by asset type, market value range, and the like, and the corresponding data obtained.
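As a non-limiting illustration, the calculation of each asset type's proportion of total assets may be sketched as follows. The holdings and the function name asset_distribution are hypothetical.

```python
def asset_distribution(holdings):
    """holdings: list of (asset_type, market_value) pairs.
    Returns each asset type's share of total market value."""
    totals = {}
    for asset_type, value in holdings:
        totals[asset_type] = totals.get(asset_type, 0.0) + value
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {asset_type: 0.0 for asset_type in totals}
    return {asset_type: value / grand_total
            for asset_type, value in totals.items()}

# Hypothetical holdings, for illustration only.
holdings = [("stock", 4000.0), ("bond", 3000.0),
            ("fund", 1000.0), ("stock", 2000.0)]
distribution = asset_distribution(holdings)
```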

Step S64, carrying out transaction behavior pattern recognition on the high-frequency transaction data of the investors and the personal asset distribution data so as to acquire the transaction behavior pattern data of the investors;

Specifically, for example, an appropriate transaction behavior pattern recognition method, such as a machine learning algorithm or time series analysis, may be selected. The model or algorithm is chosen according to the specific requirements and the characteristics of the data. The investors' high-frequency transaction data and personal asset distribution data are analyzed and modeled using the selected method to identify the investors' transaction behavior patterns. Patterns in the investors' transaction frequency, transaction size, asset allocation, etc. can be explored.

Step S65, carrying out credit risk assessment on the investors according to the investor transaction behavior pattern data so as to obtain investor credit risk assessment data;

Specifically, for example, appropriate credit risk assessment indicators, such as transaction frequency, transaction size, and position concentration, may be selected according to business needs and risk management criteria. An assessment model is established based on the selected indicators, using methods such as statistical analysis and machine learning. Model training and validation are performed using past investor data as historical data, in order to optimize the predictive power and stability of the model. The historical data may include investors' transaction records, holding information, asset distribution, transaction frequency, etc. The credit risk assessment indicator system for investors can be formulated by combining industry experience and expert opinion. Credit risk assessment is then carried out on the investors according to the investor transaction behavior pattern data and the constructed credit risk assessment model. The assessment result may be a quantitative score or a grade reflecting the investor's level of credit risk.
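As a non-limiting illustration, a position concentration indicator and a weighted credit risk score may be sketched as follows. Measuring concentration with the Herfindahl-Hirschman index is a choice made for this sketch, not a statement of the disclosed method; the weights and inputs are hypothetical, and all indicators are assumed to be scaled to [0, 1] in advance.

```python
def position_concentration(shares):
    """Herfindahl-Hirschman index of position shares that sum to 1:
    1/n for n equal positions, 1.0 for a single position."""
    return sum(share * share for share in shares)

def credit_risk_score(trade_frequency, trade_size, concentration,
                      weights=(0.3, 0.3, 0.4)):
    """Weighted average of indicators already scaled to [0, 1]."""
    indicators = (trade_frequency, trade_size, concentration)
    return sum(w * x for w, x in zip(weights, indicators)) / sum(weights)

# Hypothetical investor, for illustration only.
concentration = position_concentration([0.5, 0.3, 0.2])
investor_score = credit_risk_score(0.7, 0.4, concentration)
```

A higher score indicates higher assessed risk in this sketch; the score could then be converted to a grade by preset thresholds.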

Step S66, credit risk measurement and analysis are carried out on the investor credit risk assessment data so as to obtain investor credit risk assessment data, credit files of corresponding investors are intelligently marked by utilizing the investor credit risk assessment data so as to obtain investor credit file label data, and the investor credit file label data are sent to enterprises to which target financial products belong.

Specifically, for example, the credit risk assessment data of investors may be measured and analyzed, and indicators such as average risk level and risk distribution may be calculated. The performance and risk exposure in the market of investors in different risk classes are further analyzed. According to the investors' credit risk assessment data, a corresponding credit archive label is intelligently assigned to each investor. The labels may be risk classes, risk categories, etc., used to quickly identify and classify the investors' credit risk levels. The investor credit archive label data is sent to the enterprise to which the target financial product belongs. The label data may be provided to the enterprise through a data interface, file transfer, etc. to support its risk management and decision-making processes.

By collecting the investor account record data, the invention can acquire the investor's transaction records and account information for the target financial product. Such data may provide information about the investor's trading behavior, preferences, risk tolerance, etc., providing underlying data for the risk assessment and personalized recommendations of subsequent steps. By performing high-frequency transaction detection on the investor account record data, the investor's high-frequency transaction behavior can be identified. The high-frequency transaction data can help in understanding the investor's trading activity and trading preferences, and in further analyzing the investor's risk preferences and investment behavior. Through statistics and analysis of the investor's asset data, the investor's asset structure and asset allocation can be understood. The personal asset distribution data can help assess the investor's financial status and risk tolerance, providing a basis for subsequent credit risk assessment and personalized recommendations. By performing transaction behavior pattern recognition on the investor high-frequency transaction data and the personal asset distribution data, the investor's transaction patterns and preferences can be understood. The transaction behavior pattern data can help in understanding the investor's investment strategies and risk preferences, and provides a basis for subsequent credit risk assessment and personalized recommendations. By performing credit risk assessment on the investor transaction behavior pattern data, the investor's credit status and risk of default can be assessed. The credit risk assessment data may help financial institutions understand the investor's credit risk level, providing a basis for risk management and decision making. By measuring and analyzing the investor credit risk assessment data, the investor's credit condition and risk level can be comprehensively assessed. 
The credit risk assessment data and the credit archive label data can help financial institutions to better know credit conditions and risk characteristics of investors, provide basis for risk management and decision making, and provide personalized services and product recommendation for target financial products.
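As a non-limiting illustration, the high-frequency transaction detection and the personal asset distribution analysis might be sketched as follows. The sliding-window criterion, the default window length and trade count, and the function names `is_high_frequency` and `asset_distribution` are assumptions for this example; the embodiment does not fix a particular detection rule.

```python
from datetime import datetime, timedelta


def is_high_frequency(timestamps, window=timedelta(minutes=1), min_trades=5):
    """Return True if at least `min_trades` trades fall within any span of length `window`.

    The window length and trade count are hypothetical thresholds.
    """
    ts = sorted(timestamps)
    start = 0
    for end in range(len(ts)):
        # Shrink the window from the left until it spans at most `window`.
        while ts[end] - ts[start] > window:
            start += 1
        if end - start + 1 >= min_trades:
            return True
    return False


def asset_distribution(holdings):
    """Convert absolute holdings per asset class into portfolio shares."""
    total = sum(holdings.values())
    if total <= 0:
        raise ValueError("total holdings must be positive")
    return {asset: value / total for asset, value in holdings.items()}


# Illustrative inputs only; not data from the patent.
base = datetime(2024, 3, 20, 9, 30)
burst = [base + timedelta(seconds=10 * i) for i in range(6)]  # 6 trades in 50 s
sparse = [base + timedelta(hours=i) for i in range(6)]        # 1 trade per hour

flag_burst = is_high_frequency(burst)
flag_sparse = is_high_frequency(sparse)
shares = asset_distribution({"stocks": 6000, "bonds": 3000, "cash": 1000})
```

The flag and the share vector could then be combined into the input for the transaction behavior pattern recognition step.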

The present embodiments are, therefore, to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein.

The foregoing is only a specific embodiment of the invention to enable those skilled in the art to understand or practice the invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims

1. A financial risk level marking method based on data mining, characterized by comprising the following steps:

Step S4, acquiring enterprise operation data of a target financial product to obtain an enterprise operation data set, extracting features and reducing dimensions of the enterprise operation data set to obtain product management risk feature data, grading the product management risk feature data to obtain product potential risk data, wherein the step S4 comprises the following steps:

Step S45, inputting the product management risk feature vector into a product management risk rating model to perform feature mapping and management risk prediction so as to acquire product potential risk data;

Step S6, acquiring an investor's account records related to the target financial product to obtain investor account record data, performing personal asset distribution analysis on the investor to obtain personal asset distribution data, performing credit risk assessment on the investor according to the investor account record data and the personal asset distribution data to obtain investor credit risk assessment data, and sending the investor credit risk assessment data to the enterprise to which the target financial product belongs, wherein the step S6 comprises the following steps:

Step S64, performing transaction behavior pattern recognition on the investor high-frequency transaction data and the personal asset distribution data so as to acquire investor transaction behavior pattern data, wherein the transaction behavior pattern data comprises patterns of the investor's transaction frequency, transaction scale and asset configuration;

2. The method for labeling financial risk levels based on data mining according to claim 1, wherein the step S2 comprises the steps of:

Step S21, acquiring data acquisition target city data;

3. The method for labeling financial risk level based on data mining according to claim 2, wherein step S24 performs reliability calculation on each data source in the economic index data set by using a data source reliability calculation formula, wherein the data source reliability calculation formula is as follows:

;

In the formula, the quantities are, in order: the trustworthiness of the data source; the number of economic index data items acquired from the data source; the serial number of an economic index data item acquired from the data source; the deviation of the economic index data item with that serial number from the true value; the circular constant π (the ratio of a circle's circumference to its diameter); the standard deviation of the economic index data item with that serial number; the acquisition time of the economic index data item with that serial number; the base of the natural logarithm, e; and an arbitrary positive real number.

4. The method for labeling financial risk levels based on data mining according to claim 2, wherein the step S25 comprises the steps of:

5. The method for labeling financial risk levels based on data mining according to claim 4, wherein the step S254 comprises the steps of:

6. The method for labeling financial risk levels based on data mining according to claim 5, wherein the step S2544 comprises the steps of:

7. The method for labeling financial risk levels based on data mining according to claim 1, wherein the step S3 comprises the steps of:

8. The method for labeling financial risk levels based on data mining according to claim 1, wherein the step S5 comprises the steps of:

Step S53, acquiring a market fluctuation data set;