
CN112733015A - User behavior analysis method, device, equipment and medium - Google Patents

Info

Publication number
CN112733015A
Authority
CN
China
Prior art keywords
analysis
data
algorithm
stage
determining
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011612631.0A
Other languages
Chinese (zh)
Other versions
CN112733015B (en)
Inventor
李阳
黄�俊
潘登
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nsfocus Technologies Inc
Nsfocus Technologies Group Co Ltd
Original Assignee
Nsfocus Technologies Inc
Nsfocus Technologies Group Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nsfocus Technologies Inc, Nsfocus Technologies Group Co Ltd filed Critical Nsfocus Technologies Inc
Priority to CN202011612631.0A priority Critical patent/CN112733015B/en
Publication of CN112733015A publication Critical patent/CN112733015A/en
Application granted granted Critical
Publication of CN112733015B publication Critical patent/CN112733015B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/16 Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Business, Economics & Management (AREA)
  • Mathematical Physics (AREA)
  • Pure & Applied Mathematics (AREA)
  • Computational Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Economics (AREA)
  • Human Resources & Organizations (AREA)
  • Mathematical Optimization (AREA)
  • Mathematical Analysis (AREA)
  • Strategic Management (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Computing Systems (AREA)
  • Operations Research (AREA)
  • General Business, Economics & Management (AREA)
  • Marketing (AREA)
  • Game Theory and Decision Science (AREA)
  • Algebra (AREA)
  • Tourism & Hospitality (AREA)
  • Development Economics (AREA)
  • Software Systems (AREA)
  • Quality & Reliability (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The invention discloses a user behavior analysis method, a user behavior analysis device, user behavior analysis equipment and a user behavior analysis medium, which are used for solving the problem that in the prior art, the detection analysis result is inaccurate due to the fact that a user manually selects an algorithm used by a behavior analysis platform. According to the embodiment of the invention, aiming at the first service scene to which the first data source belongs, the corresponding first analysis algorithm can be recommended for each first analysis stage in the first data source, and then the first analysis algorithm corresponding to each first analysis stage is adopted according to the execution sequence of each first analysis stage to perform user behavior analysis on the first data to be analyzed in the first data source.

Description

User behavior analysis method, device, equipment and medium
Technical Field
The present invention relates to the field of network security technologies, and in particular, to a method, an apparatus, a device, and a medium for user behavior analysis.
Background
In the field of network security, a growing number of modeling methods and platforms are used for user behavior analysis. At the software level in particular, in order to effectively detect potentially abnormal behavior by malicious users, a behavior analysis platform often needs to expand its functions, for example by adding detection methods, detection engines, and detection rules.
However, a conventional behavior analysis platform requires the user to configure it through a front-end interface. During configuration the user must select different algorithms under different menus and, for each algorithm, must also choose information such as its parameters and parameter values, which makes the process complex. Users who are unfamiliar with the technology find it difficult to select a suitable algorithm, and a user who selects an unsuitable algorithm cannot obtain the expected detection and analysis result. As functions are expanded, selection becomes considerably harder for the user, so the expected detection and analysis result is ultimately not achieved.
In summary, existing behavior analysis platforms require the user to select algorithms manually, which is difficult to do, involves a complex process, and makes the expected detection and analysis result hard to achieve.
Disclosure of Invention
The invention provides a user behavior analysis method, a user behavior analysis device, user behavior analysis equipment and a user behavior analysis medium, which are used for solving the problem that in the prior art, the detection analysis result is inaccurate due to the fact that a user manually selects an algorithm used by a behavior analysis platform.
In a first aspect, a method for analyzing user behavior is provided, where the method includes:
determining a first data source, determining a first service scene to which the first data source belongs, and determining at least one first analysis stage corresponding to the first data source, wherein the first data source comprises first data to be analyzed;
determining a first analysis algorithm corresponding to each first analysis stage in the at least one first analysis stage according to a first service scenario to which the first data source belongs;
and according to the execution sequence of the at least one first analysis stage, adopting a first analysis algorithm corresponding to each first analysis stage to analyze the user behavior of the first data.
Further, the determining, according to the first service scenario to which the first data source belongs, a first analysis algorithm corresponding to each first analysis stage in the at least one first analysis stage includes:
recommending at least one analysis algorithm for each first analysis stage according to the first service scene to which the first data source belongs; and determining the analysis algorithm selected by the user in the at least one analysis algorithm as the first analysis algorithm corresponding to the first analysis stage.
Further, the recommending, for each first analysis stage according to the first service scenario to which the first data source belongs, at least one analysis algorithm for the first analysis stage includes:
and recommending at least one analysis algorithm for each first analysis stage according to the first service scene to which the first data source belongs, and outputting the recommended level of each analysis algorithm.
Further, after the determining the first data source, the method further includes:
identifying a first data format of the first data source;
preprocessing the first data according to the first data format, wherein different data formats correspond to different preprocessing modes;
according to the execution sequence of the at least one first analysis stage, performing user behavior analysis on the first data by adopting a first analysis algorithm corresponding to each first analysis stage, including:
and according to the execution sequence of the at least one first analysis stage, adopting a first analysis algorithm corresponding to each first analysis stage to analyze the user behavior of the preprocessed first data.
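The format-dependent preprocessing described above can be illustrated with a minimal sketch. The two formats (JSON lines and CSV), the detection heuristic, and the function names `detect_format` and `preprocess` are assumptions made for illustration; the source does not name specific formats or preprocessing modes.

```python
import csv
import io
import json


def detect_format(raw: str) -> str:
    """Assumed heuristic: text starting with '{' is JSON lines, anything else is CSV."""
    return "json" if raw.lstrip().startswith("{") else "csv"


def preprocess(raw: str) -> list[dict]:
    """Apply the preprocessing mode that matches the identified data format."""
    if detect_format(raw) == "json":
        # One JSON object per non-empty line.
        return [json.loads(line) for line in raw.splitlines() if line.strip()]
    # CSV with a header row; each row becomes a dict keyed by column name.
    return [dict(row) for row in csv.DictReader(io.StringIO(raw))]
```

Both branches return the same record shape, so the later analysis stages need not know which format the data source used.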
Further, the analysis algorithm includes at least one of the following algorithms: a rule analysis algorithm, a statistical analysis algorithm, a time series analysis algorithm, a machine learning algorithm, an improved hidden Markov model A-HMM algorithm.
Further, the A-HMM algorithm satisfies the following equation:
[Equation image: Figure BDA0002873323210000031]
wherein AT is the optimized state transition matrix, T is the state transition matrix, alpha is the attention weight, alpha is not less than 0 and not more than 1, beta is the attention factor, i is a positive integer not less than 1 and not more than N, and N is the total number of elements in the state transition matrix.
Further, the method also comprises the following steps:
A: selecting, from at least one second data in the first data set, the first of the second data as the currently analyzed second data;
B: analyzing the currently analyzed second data; if a single point anomaly is detected for the currently analyzed second data, increasing the current value of the alpha by a first value; if a context anomaly is detected for the currently analyzed second data, reducing the current value of the alpha by a second value; otherwise, keeping the current value of the alpha unchanged;
C: judging whether the currently analyzed second data is the last data; if not, executing D; if yes, executing E;
D: selecting the next data of the currently analyzed second data from the at least one second data as the currently analyzed second data, and returning to step B;
E: determining the current value of the alpha as the final value of the alpha.
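The iteration over the data set that adjusts alpha can be sketched as a single loop. This is a minimal illustration, not the patented implementation: the outcome labels, the starting value of 0.5, the default step sizes, and the clamping of alpha to the range 0 to 1 after each update are all assumptions (the source states only that alpha lies between 0 and 1).

```python
from typing import Iterable

# Hypothetical labels for the result of analyzing one data item.
SINGLE_POINT = "single_point_anomaly"
CONTEXT = "context_anomaly"
NORMAL = "normal"


def tune_alpha(outcomes: Iterable[str], alpha: float = 0.5,
               first_value: float = 0.05, second_value: float = 0.05) -> float:
    """Walk the data set once and return the final value of alpha."""
    for outcome in outcomes:                 # select the first item, then each next item
        if outcome == SINGLE_POINT:          # single point anomaly: increase alpha
            alpha += first_value
        elif outcome == CONTEXT:             # context anomaly: reduce alpha
            alpha -= second_value
        # otherwise alpha is kept unchanged
        alpha = min(1.0, max(0.0, alpha))    # assumption: keep alpha within [0, 1]
    return alpha                             # value after the last item is the final alpha
```

Each element of `outcomes` stands for the analysis result of one data item, so the loop body corresponds to the analyze-and-adjust step and the loop exit corresponds to fixing the final value.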
Further, the method also comprises the following steps:
determining, for each third data in at least one third data in a second data set, an abnormal behavior threat level of the third data, and if the abnormal behavior threat level of the third data exceeds a threat level threshold value, determining the third data as malicious sample data;
determining a second analysis algorithm corresponding to at least one second analysis stage in a second service scene and an analysis result of the second analysis algorithm on the malicious sample data, and determining a first score of the second analysis algorithm in the second analysis stage;
determining a second score of a combination of at least one second analysis algorithm in the second service scenario according to the first score of the second analysis algorithm of each second analysis stage;
for each second analysis algorithm in the at least one second analysis algorithm, determining a recommendation level of the second analysis algorithm in the corresponding second analysis stage in the second business scenario according to the first score and the second score of the second analysis algorithm.
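The scoring steps above can be sketched as follows. The source does not give formulas for the first score, the second score, or the recommendation level, so every formula here is an assumption chosen only to make the data flow concrete: the first score is taken as the share of malicious samples an algorithm flagged, the second score as the mean of the per-stage first scores, and the recommendation level as a weighted blend of the two.

```python
from statistics import mean


def malicious_samples(threat_levels: dict[str, float], threshold: float) -> set[str]:
    """Keep the data items whose abnormal behavior threat level exceeds the threshold."""
    return {item for item, level in threat_levels.items() if level > threshold}


def first_score(flagged: set[str], malicious: set[str]) -> float:
    """Assumed first score: share of malicious samples that the algorithm flagged."""
    return len(flagged & malicious) / len(malicious) if malicious else 0.0


def second_score(stage_scores: list[float]) -> float:
    """Assumed second score of a combination: mean of its per-stage first scores."""
    return mean(stage_scores) if stage_scores else 0.0


def recommendation_level(first: float, second: float, weight: float = 0.5) -> float:
    """Assumed blend of both scores; a larger value means a stronger recommendation."""
    return weight * first + (1.0 - weight) * second
```

Under these assumptions an algorithm is recommended more strongly when it performs well on its own stage and also belongs to a combination that performs well overall.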
Further, the method also comprises the following steps:
and updating the second data set by adopting an analysis result obtained by analyzing the user behavior of the first data.
Further, the method also comprises the following steps:
and if the abnormal behavior is detected to exist in the first data and a third service scene related to the first service scene exists, outputting alarm information of the abnormal behavior existing in the first service scene and the third service scene.
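A minimal sketch of the associated-scenario alarm follows. The association table, the scenario names, and the message text are hypothetical. Emitting an alarm for the first scenario alone when no associated scenario exists is also an assumption, since the passage covers only the case where one exists.

```python
# Hypothetical association table between service scenarios.
ASSOCIATED_SCENARIOS = {"access": ["file_transfer"]}


def build_alerts(scenario: str, abnormal: bool) -> list[str]:
    """Return one alarm message per affected scenario, or nothing if no anomaly."""
    if not abnormal:
        return []
    # The scenario itself plus every scenario associated with it.
    affected = [scenario] + ASSOCIATED_SCENARIOS.get(scenario, [])
    return [f"abnormal behavior alarm: scenario '{name}'" for name in affected]
```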
In a second aspect, there is provided a user behavior analysis apparatus, the apparatus comprising:
a determining module, configured to determine a first data source, determine a first service scenario to which the first data source belongs, and determine at least one first analysis stage corresponding to the first data source, wherein the first data source comprises first data to be analyzed; and to determine, according to the first service scenario to which the first data source belongs, a first analysis algorithm corresponding to each first analysis stage in the at least one first analysis stage;
and the analysis module is used for performing user behavior analysis on the first data by adopting a first analysis algorithm corresponding to each first analysis stage according to the execution sequence of the at least one first analysis stage.
Further, the determining module is specifically configured to recommend, for each first analysis stage, at least one analysis algorithm for the first analysis stage according to the first service scenario to which the first data source belongs; and determining the analysis algorithm selected by the user in the at least one analysis algorithm as the first analysis algorithm corresponding to the first analysis stage.
Further, the determining module is specifically configured to recommend at least one analysis algorithm for each first analysis stage according to the first service scenario to which the first data source belongs, and output a recommendation level of each analysis algorithm.
Further, the apparatus further comprises:
a preprocessing module to identify a first data format of the first data source; preprocessing the first data according to the first data format, wherein different data formats correspond to different preprocessing modes;
the analysis module is specifically configured to perform user behavior analysis on the preprocessed first data by using a first analysis algorithm corresponding to each first analysis stage according to the execution sequence of the at least one first analysis stage.
Further, the analysis algorithm includes at least one of the following algorithms: a rule analysis algorithm, a statistical analysis algorithm, a time series analysis algorithm, a machine learning algorithm, an improved hidden Markov model A-HMM algorithm.
Further, the A-HMM algorithm satisfies the following equation:
[Equation image: Figure BDA0002873323210000051]
wherein AT is the optimized state transition matrix, T is the state transition matrix, alpha is the attention weight, alpha is not less than 0 and not more than 1, beta is the attention factor, i is a positive integer not less than 1 and not more than N, and N is the total number of elements in the state transition matrix.
Further, the determining module is further configured to perform the following steps:
A: selecting, from at least one second data in the first data set, the first of the second data as the currently analyzed second data;
B: analyzing the currently analyzed second data; if a single point anomaly is detected for the currently analyzed second data, increasing the current value of the alpha by a first value; if a context anomaly is detected for the currently analyzed second data, reducing the current value of the alpha by a second value; otherwise, keeping the current value of the alpha unchanged;
C: judging whether the currently analyzed second data is the last data; if not, executing D; if yes, executing E;
D: selecting the next data of the currently analyzed second data from the at least one second data as the currently analyzed second data, and returning to step B;
E: determining the current value of the alpha as the final value of the alpha.
Further, the determining module is further configured to determine, for each third data in at least one third data in the second data set, an abnormal behavior threat level of the third data, and if the abnormal behavior threat level of the third data exceeds a threat level threshold, determine that the third data is malicious sample data; determining a second analysis algorithm corresponding to at least one second analysis stage in a second service scene and an analysis result of the second analysis algorithm on the malicious sample data, and determining a first score of the second analysis algorithm in the second analysis stage; determining a second score of a combination of at least one second analysis algorithm in the second service scenario according to the first score of the second analysis algorithm of each second analysis stage; for each second analysis algorithm in the at least one second analysis algorithm, determining a recommendation level of the second analysis algorithm in the corresponding second analysis stage in the second business scenario according to the first score and the second score of the second analysis algorithm.
Further, the apparatus further comprises:
and the updating module is used for updating the second data set by adopting an analysis result obtained by analyzing the user behavior of the first data.
Further, the apparatus further comprises:
and the alarm module is used for outputting alarm information indicating that abnormal behavior exists in the first service scene and the third service scene if abnormal behavior is detected in the first data and a third service scene associated with the first service scene exists.
In a third aspect, an electronic device is provided, where the electronic device includes at least a processor and a memory, and the processor is configured to implement the steps of any of the user behavior analysis methods described above when executing a computer program stored in the memory.
In a fourth aspect, a computer-readable storage medium is provided, which stores a computer program, and the computer program, when executed by a processor, implements the steps of any of the user behavior analysis methods described above.
According to the embodiment of the invention, aiming at the first service scene to which the first data source belongs, the corresponding first analysis algorithm can be recommended for each first analysis stage in the first data source, and then the first analysis algorithm corresponding to each first analysis stage is adopted according to the execution sequence of each first analysis stage to perform user behavior analysis on the first data to be analyzed in the first data source.
Drawings
Fig. 1 is a schematic diagram of a user behavior analysis process according to an embodiment of the present invention;
Fig. 2 is a schematic diagram of a front-end interface of a behavior analysis platform according to an embodiment of the present invention;
Fig. 3 is a schematic diagram of a behavior analysis platform listener according to an embodiment of the present invention;
Fig. 4 is a schematic diagram of a back-end processing process of a behavior analysis platform according to an embodiment of the present invention;
Fig. 5 is a schematic diagram of a user behavior analysis apparatus according to an embodiment of the present invention;
Fig. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the present invention will be described in further detail with reference to the accompanying drawings, and it is apparent that the described embodiments are only a part of the embodiments of the present invention, not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In order to improve the accuracy of a detection analysis result of user behavior analysis, embodiments of the present invention provide a user behavior analysis method, apparatus, device, and medium.
Example 1:
Fig. 1 is a schematic diagram of a user behavior analysis method provided in an embodiment of the present invention, where the method includes the following steps:
S101: determining a first data source, determining a first service scenario to which the first data source belongs, and determining at least one first analysis stage corresponding to the first data source, wherein the first data source comprises first data to be analyzed.
The user behavior analysis method provided by the embodiment of the invention is applied to an electronic device; for example, the electronic device may be a server or a protection device on which a behavior analysis platform is installed. The behavior analysis platform is a behavior analysis system, which is generally a software system. It can be understood that software implementing the user behavior analysis method provided by the embodiment of the present invention may be installed directly on a user device (such as the above server or protection device), or may be exposed through an interface to a third-party platform so that the third-party platform gains a user behavior analysis function, with the third-party platform then being installed on the user device.
The user behavior analysis method provided by the embodiment of the invention can be applied to scenes related to behavior analysis, such as abnormity detection, evaluation, trend prediction and the like. The embodiment of the invention mainly takes an abnormal behavior detection scene of a malicious user as an example for explanation.
The electronic device may determine a first data source, where the first data source may be a file such as log file information or a database that stores data, or may be a physical device that stores data. The electronic device may perform user behavior analysis on each data source that can be obtained or accessed, where the data source currently performing the user behavior analysis is the first data source, or the electronic device may perform the user behavior analysis on the data source selected by the user, where the data source selected by the user is the first data source. The first data source includes first data to be analyzed, and the first data may include one or more types of data information such as session data, access data, a web log, or a system log.
In the embodiment of the present invention, an analysis algorithm may be intelligently recommended based on the service scenario to which a data source belongs, where the service scenario may include, but is not limited to: file transmission service scenarios (such as monitoring illegal downloads and uploads on an intranet), access service scenarios (such as abnormal login) and the like. The electronic device may determine the first service scenario to which the first data source belongs according to the first data source or according to the first data included in the first data source. For example, when the first data source is a session data log or an access data log, the electronic device may determine that the first service scenario to which the first data source belongs is an access service scenario. Optionally, the electronic device may pre-store a corresponding relationship between data sources and service scenarios, or the electronic device may perform preliminary analysis on the data source to determine the service scenario to which the data source belongs.
The electronic device may also determine at least one first analysis stage corresponding to the first data source. Optionally, the electronic device may pre-store a corresponding relationship between a data source and at least one analysis stage, or the electronic device may pre-store a corresponding relationship between a service scenario and at least one analysis stage.
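The pre-stored correspondences mentioned above can be pictured as two lookup tables. This is a sketch under stated assumptions: the data-source types, scenario names, and stage names below are all hypothetical, and the source equally allows a direct data-source-to-stage table or a preliminary analysis of the data instead of a lookup.

```python
# Hypothetical pre-stored correspondences; every name below is illustrative.
SOURCE_TO_SCENARIO = {
    "session_log": "access",
    "access_log": "access",
    "transfer_log": "file_transfer",
}
SCENARIO_TO_STAGES = {
    "access": ["preliminary_screening", "anomaly_detection"],
    "file_transfer": ["preliminary_screening", "anomaly_detection", "threat_rating"],
}


def resolve(source_type: str) -> tuple[str, list[str]]:
    """Look up the service scenario of a data source and its ordered analysis stages."""
    scenario = SOURCE_TO_SCENARIO[source_type]
    return scenario, SCENARIO_TO_STAGES[scenario]
```

Keeping the stage list ordered lets the same table also record the execution order of the stages.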
S102: determining a first analysis algorithm corresponding to each first analysis stage in the at least one first analysis stage according to the first service scene to which the first data source belongs.
The electronic equipment can train different combined models in different service scenes in advance according to historical data. The electronic device may store a model library, where different combination models and a score (or a weight) of each combination model with respect to different service scenarios are stored in the model library, where the score of each combination model with respect to different service scenarios is used to indicate a degree to which each combination model is adapted to different service scenarios. For example, the model library stores the combination model 1 and the combination model 2, and stores the score of the combination model 1 with respect to the business scenario 1, the score of the combination model 1 with respect to the business scenario 2, the score of the combination model 2 with respect to the business scenario 1, and the score of the combination model 2 with respect to the business scenario 2.
Each combined model comprises an analysis algorithm corresponding to each analysis stage. For example, combined model 1 includes analysis algorithm 11 corresponding to analysis stage 1 and analysis algorithm 21 corresponding to analysis stage 2, and combined model 2 includes analysis algorithm 12 corresponding to analysis stage 1 and analysis algorithm 22 corresponding to analysis stage 2.
In a possible manner, in S102, the electronic device may determine at least one combination model corresponding to the first service scenario, then select a target combination model from the at least one combination model (for example, select a combination model with a highest score relative to the first service scenario as the target combination model), and use an analysis algorithm corresponding to each analysis stage in the target combination model as a first analysis algorithm corresponding to each first analysis stage in the at least one first analysis stage, respectively.
In this manner, the user may participate in the selection of the target combination model, for example, the electronic device displays all combination models or a part of combination models corresponding to the first service scenario, and displays a score of each combination model relative to the first service scenario, and the user may select the target combination model according to the displayed score of each combination model relative to the first service scenario.
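Selecting the target combined model by its score for the service scenario can be sketched as below. The class `CombinedModel`, the function `select_target_model`, the library contents, and all score values are illustrative assumptions; only the idea of picking the highest-scoring combined model for the scenario comes from the text.

```python
from dataclasses import dataclass


@dataclass
class CombinedModel:
    name: str
    algorithms: dict[str, str]   # analysis stage -> analysis algorithm
    scores: dict[str, float]     # service scenario -> score


def select_target_model(library: list[CombinedModel], scenario: str) -> CombinedModel:
    """Return the combined model with the highest score for the given scenario."""
    candidates = [m for m in library if scenario in m.scores]
    if not candidates:
        raise LookupError(f"no combined model stored for scenario {scenario!r}")
    return max(candidates, key=lambda m: m.scores[scenario])


# Illustrative model library with made-up scores.
MODEL_LIBRARY = [
    CombinedModel("combined_model_1",
                  {"stage_1": "algorithm_11", "stage_2": "algorithm_21"},
                  {"scenario_1": 0.8, "scenario_2": 0.4}),
    CombinedModel("combined_model_2",
                  {"stage_1": "algorithm_12", "stage_2": "algorithm_22"},
                  {"scenario_1": 0.6, "scenario_2": 0.9}),
]
```

The `algorithms` mapping of the returned model then supplies the first analysis algorithm for each first analysis stage. A user-driven variant would display `candidates` with their scores rather than calling `max`.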
In another possible manner, in S102, the electronic device determines at least one combination model corresponding to the first service scenario and then determines, for each combination model in the at least one combination model, a score of the analysis algorithm corresponding to each analysis stage in the combination model. The electronic device then determines the first analysis algorithm corresponding to each first analysis stage according to the scores of the analysis algorithms corresponding to that first analysis stage. For example, the electronic device determines a score of an analysis algorithm 11 corresponding to the analysis stage 1 and a score of an analysis algorithm 12 corresponding to the analysis stage 1, and the electronic device determines a first analysis algorithm corresponding to the analysis stage 1 according to the score of the analysis algorithm 11 and the score of the analysis algorithm 12.
In this manner, the user may participate in the selection of the first analysis algorithm, for example, the electronic device displays all or part of the analysis algorithms corresponding to the first analysis stage in each first analysis stage, and displays the score of each analysis algorithm, and the user may select the first analysis algorithm corresponding to the first analysis stage according to the displayed score of each analysis algorithm.
S103: according to the execution sequence of the at least one first analysis stage, adopting a first analysis algorithm corresponding to each first analysis stage to analyze the user behavior of the first data.
When the electronic device determines at least one first analysis stage corresponding to the first data source, the electronic device may further determine an execution order of the at least one first analysis stage. For example, the electronic device may pre-store an execution sequence of at least one first analysis stage.
In S103, following the execution sequence of the at least one first analysis stage, the electronic device may process the first data with the first analysis algorithm corresponding to the first of the first analysis stages, then process the output of that stage with the first analysis algorithm corresponding to the second of the first analysis stages, and so on until all the first analysis stages are completed.
After the electronic device completes the analysis processing of all the first analysis stages, the electronic device may output the analysis result of the user behavior analysis, for example, output the alarm information of the existing abnormal behavior event, or output the information of the absence of the abnormal behavior event.
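The stage-by-stage processing in S103 amounts to a sequential pipeline in which each stage consumes the previous stage's output. The following is a minimal sketch: the two stage algorithms (a failed-login rule and a simple count threshold), the record fields, and the message text are assumptions made for illustration, not algorithms named by the source.

```python
from typing import Callable

Records = list[dict]
Algorithm = Callable[[Records], Records]


def keep_failed_logins(records: Records) -> Records:
    """Illustrative rule analysis: keep only failed-login events."""
    return [r for r in records if r["event"] == "login_failed"]


def flag_bursts(records: Records, limit: int = 3) -> Records:
    """Illustrative statistical analysis: users with at least `limit` failures."""
    counts: dict[str, int] = {}
    for r in records:
        counts[r["user"]] = counts.get(r["user"], 0) + 1
    return [{"user": u, "failures": n} for u, n in counts.items() if n >= limit]


def run_stages(data: Records, order: list[str], chosen: dict[str, Algorithm]) -> Records:
    """Feed each stage's output into the next stage, in execution order."""
    for stage in order:
        data = chosen[stage](data)
    return data


def report(findings: Records) -> str:
    """Output either alarm information or a no-anomaly message."""
    if findings:
        users = ", ".join(f["user"] for f in findings)
        return f"abnormal behavior detected for: {users}"
    return "no abnormal behavior detected"
```

Because `chosen` maps stage names to callables, swapping the algorithm for one stage does not affect the others, which matches the per-stage selection described earlier.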
According to the embodiment of the invention, aiming at the first service scene to which the first data source belongs, the corresponding first analysis algorithm can be recommended for each first analysis stage in the first data source, and then the first analysis algorithm corresponding to each first analysis stage is adopted according to the execution sequence of each first analysis stage to perform user behavior analysis on the first data to be analyzed in the first data source.
Example 2:
In order to improve flexibility in the user behavior analysis process, on the basis of the above embodiment, in an embodiment of the present invention, the determining, according to the first service scenario to which the first data source belongs, the first analysis algorithm corresponding to each first analysis stage in the at least one first analysis stage includes:
recommending at least one analysis algorithm for each first analysis stage according to the first service scene to which the first data source belongs; and determining the analysis algorithm selected by the user in the at least one analysis algorithm as the first analysis algorithm corresponding to the first analysis stage.
The behavior analysis platform can directly recommend a corresponding analysis algorithm for each analysis stage according to the service scene. The user can also participate in selecting the analysis algorithm for each analysis stage, which improves flexibility in the user behavior analysis process and makes the analysis better suited to the first service scene.
The electronic device may save scores of different combination models with respect to a first service scenario to which the first data source belongs. The electronic device may determine, for each combined model, a score of an analysis algorithm corresponding to each analysis stage in the combined model, and at this time, the electronic device obtains, for each first analysis stage, a score of each analysis algorithm corresponding to the first analysis stage, and the electronic device recommends at least one analysis algorithm for the first analysis stage based on the score of each analysis algorithm corresponding to the first analysis stage, and it may be understood that the at least one analysis algorithm recommended by the electronic device for the first analysis stage may be all analysis algorithms or part of analysis algorithms (for example, the first three analysis algorithms with the highest scores) corresponding to the first analysis stage.
Specifically, the electronic device may display a set number of analysis algorithms for the user to select according to the score of each analysis algorithm corresponding to each first analysis stage from high to low. The set number may be any positive integer, and is not limited in the embodiment of the present invention.
The electronic device recommends at least one analysis algorithm for each first analysis stage, so the user only needs to select, from the recommended analysis algorithms, the one suitable for the first service scenario, which greatly reduces the difficulty of selection. The electronic device then determines the analysis algorithm selected by the user from the at least one analysis algorithm as the first analysis algorithm corresponding to the first analysis stage.
Because the embodiment of the invention recommends at least one analysis algorithm for the user aiming at each first analysis stage, the user can select the first analysis algorithm corresponding to the first analysis stage according to the actual requirement, so that the flexibility in the user behavior analysis process is improved, and the method is more suitable for user behavior analysis in the first service scene.
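The recommend-then-select flow described above can be sketched as follows. This is a minimal illustration only: the score table `SCORES`, the scenario and stage names, and the functions `recommend` and `choose` are hypothetical and do not appear in the original text.

```python
from typing import Dict, List

# Hypothetical score table: scenario -> analysis stage -> algorithm -> score.
SCORES: Dict[str, Dict[str, Dict[str, float]]] = {
    "abnormal_service_access": {
        "feature_extraction": {"statistical": 0.72, "time_series": 0.81, "rule": 0.64},
        "anomaly_detection": {
            "a_hmm": 0.90,
            "machine_learning": 0.85,
            "rule": 0.70,
            "statistical": 0.66,
        },
    }
}


def recommend(scenario: str, stage: str, set_number: int = 3) -> List[str]:
    """Return up to `set_number` algorithm names for a stage, highest score first."""
    if set_number < 1:
        raise ValueError("set_number must be a positive integer")
    stage_scores = SCORES[scenario][stage]
    ranked = sorted(stage_scores, key=lambda name: stage_scores[name], reverse=True)
    return ranked[:set_number]


def choose(recommended: List[str], user_choice: str) -> str:
    """Accept the user's pick only if it is among the recommended algorithms."""
    if user_choice not in recommended:
        raise ValueError(f"{user_choice!r} was not recommended for this stage")
    return user_choice
```

Here the position of an algorithm in the returned list could also serve as its display order when the candidates are shown to the user.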
Example 3:
In order to further reduce the difficulty in using the analysis platform, on the basis of the foregoing embodiments, in an embodiment of the present invention, the recommending of at least one analysis algorithm for each first analysis stage according to the first service scenario to which the first data source belongs includes:
recommending at least one analysis algorithm for each first analysis stage according to the first service scenario to which the first data source belongs, and outputting the recommendation level of each analysis algorithm.
The electronic device outputs the recommendation level of each analysis algorithm recommended for each first analysis stage. The recommendation level may be represented by display order (for example, the earlier an analysis algorithm is displayed, the higher its recommendation level), by a sequence number assigned to each analysis algorithm (for example, the lower the sequence number, the higher the recommendation level, so that the analysis algorithm with sequence number 1 has a higher recommendation level than the analysis algorithm with sequence number 2), or by the score of each analysis algorithm (for example, the higher the score, the higher the recommendation level).
In the embodiment of the invention, the user can select the first analysis algorithm in the analysis stage more intuitively and quickly through the recommendation level of each analysis algorithm in the analysis stage, so that the use difficulty of the analysis platform is further reduced, and the flexibility of the user in using the analysis platform is improved.
Example 4:
Different data sources have different data formats, whereas a general analysis platform can only process data in a single format. In the prior art, therefore, a user has to manually convert the initial data of each data source into the data format supported by the analysis platform before the platform can perform subsequent analysis and processing. This processing mode is rigid and not intelligent, places technical demands on the user, and makes the analysis platform too difficult to use. Therefore, in order to further reduce the difficulty in using the analysis platform and improve the applicability of the analysis platform to different data sources, on the basis of the foregoing embodiments, in the embodiment of the present invention, after the determining the first data source, the method further includes:
identifying a first data format of the first data source;
preprocessing the first data according to the first data format, wherein different data formats correspond to different preprocessing modes;
and the performing, according to the execution sequence of the at least one first analysis stage, of user behavior analysis on the first data by adopting the first analysis algorithm corresponding to each first analysis stage includes:
performing, according to the execution sequence of the at least one first analysis stage, user behavior analysis on the preprocessed first data by adopting the first analysis algorithm corresponding to each first analysis stage.
In order to further reduce the difficulty in using the analysis platform and improve the applicability of the analysis platform to different data sources, in an embodiment of the present invention, after the electronic device determines the first data source, the first data format of the first data source may also be identified, that is, the data format of the first data included in the first data source is the first data format.
Data formats include, but are not limited to, structured data, semi-structured data, and unstructured data. For example, if the first data source is a relational database, the electronic device may determine that the data of the first data source is structured data; if the first data source is a log file acquired by a machine or equipment, the electronic equipment can determine that the data of the first data source is semi-structured data; if the first data source is a collected webpage file or encyclopedia data file, the electronic device may determine that the data of the first data source is unstructured data.
The electronic device may pre-store a corresponding relationship between a data format and a preprocessing mode, where different data formats correspond to different preprocessing modes. After the electronic device identifies the first data format of the first data source, the electronic device may perform preprocessing on the first data according to the first data format by using a first preprocessing mode corresponding to the first data format. The preprocessed first data is in a data format which can be processed by the behavior analysis platform.
The subsequent electronic device may also perform user behavior analysis on the preprocessed first data by using a first analysis algorithm corresponding to each first analysis stage according to the execution sequence of the at least one first analysis stage.
In the embodiment of the invention, different preprocessing modes are provided for data in different data formats, so that automatic processing of electronic equipment can be realized, manual processing by a user is avoided, the use difficulty of a behavior analysis platform is further reduced, the method and the device can adapt to data sources in different data formats, and the applicability of the behavior analysis platform is further improved.
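The pre-stored correspondence between data formats and preprocessing modes can be pictured as a lookup table. The sketch below is an assumption-laden illustration: the source kinds, the enum `DataFormat`, and the three toy preprocessing functions are hypothetical stand-ins, and the text does not specify how any particular format is actually preprocessed.

```python
from enum import Enum
from typing import Callable, Dict, List


class DataFormat(Enum):
    STRUCTURED = "structured"
    SEMI_STRUCTURED = "semi-structured"
    UNSTRUCTURED = "unstructured"


# Hypothetical source kinds, following the examples given in the text.
SOURCE_KIND_TO_FORMAT: Dict[str, DataFormat] = {
    "relational_database": DataFormat.STRUCTURED,
    "log_file": DataFormat.SEMI_STRUCTURED,
    "web_page": DataFormat.UNSTRUCTURED,
}


def preprocess_structured(records: List[str]) -> list:
    """Split comma-separated rows into trimmed fields."""
    return [[field.strip() for field in row.split(",")] for row in records]


def preprocess_semi_structured(records: List[str]) -> list:
    """Parse 'key=value' log lines into dictionaries (every token must contain '=')."""
    return [dict(pair.split("=", 1) for pair in line.split()) for line in records]


def preprocess_unstructured(records: List[str]) -> list:
    """Lowercase free text and split it on whitespace."""
    return [text.lower().split() for text in records]


# Correspondence between data format and preprocessing mode.
PREPROCESSORS: Dict[DataFormat, Callable[[List[str]], list]] = {
    DataFormat.STRUCTURED: preprocess_structured,
    DataFormat.SEMI_STRUCTURED: preprocess_semi_structured,
    DataFormat.UNSTRUCTURED: preprocess_unstructured,
}


def identify_format(source_kind: str) -> DataFormat:
    """Identify the data format from the kind of data source."""
    return SOURCE_KIND_TO_FORMAT[source_kind]


def preprocess(source_kind: str, records: List[str]) -> list:
    """Preprocess records with the mode that matches the identified format."""
    return PREPROCESSORS[identify_format(source_kind)](records)
```

Keeping the correspondence in a table means that support for a new data format only requires a new table entry, not a change to the analysis stages that follow.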
Example 5:
In order to further improve the accuracy of the analysis and detection result of the behavior analysis platform, on the basis of the above embodiments, in the embodiments of the present invention, the analysis algorithm includes at least one of the following algorithms: a rule analysis algorithm, a statistical analysis algorithm, a time series analysis algorithm, a machine learning algorithm, and an A-HMM algorithm (an improved hidden Markov model algorithm).
In the embodiment of the present invention, an algorithm library may be built into the electronic device. The algorithm library includes a plurality of analysis algorithms, which may include conventional analysis algorithms such as a rule analysis algorithm, a statistical analysis algorithm, a time series analysis algorithm, and a machine learning algorithm, and may also include improved algorithms built on the conventional analysis algorithms, such as the A-HMM algorithm.
The rule analysis algorithm may include a specified alarm field algorithm and/or a 3-sigma rule algorithm: the electronic device may determine whether an abnormal behavior exists by checking whether the specified alarm field exists in the first data, or may use the 3-sigma rule algorithm to determine whether the first data contains abnormal behavior that falls outside the standard interval, and so on.
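As an illustration of the two rule checks just described, the following sketch tests for the presence of an alarm field and applies the 3-sigma rule, taking the standard interval to be the mean plus or minus three standard deviations. The function names and the record layout are hypothetical; the use of the population standard deviation is an assumption.

```python
from statistics import mean, pstdev
from typing import Dict, Iterable, List, Tuple


def has_alarm_field(record: Dict[str, object], alarm_fields: Iterable[str]) -> bool:
    """Return True if any specified alarm field is present in the record."""
    return any(field in record for field in alarm_fields)


def three_sigma_outliers(values: List[float]) -> List[Tuple[int, float]]:
    """Return (index, value) pairs lying outside mean +/- 3 standard deviations."""
    if len(values) < 2:
        return []
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation (an assumption)
    if sigma == 0:
        return []
    lower, upper = mu - 3 * sigma, mu + 3 * sigma
    return [(i, v) for i, v in enumerate(values) if v < lower or v > upper]
```

For example, in a series of twenty observations of 10 followed by a single observation of 100, only the final observation falls outside the interval.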
The electronic device may statistically analyze abnormal behavior existing in the first data according to the statistical analysis algorithm.
The electronic equipment can extract the original behaviors of the user according to the time sequence analysis algorithm and predict the future behaviors of the user, so that whether abnormal behaviors exist or not is analyzed.
The machine learning algorithm can learn specific service scenes so as to match different service scenes, so that the analysis of abnormal behaviors can be realized under different service scenes, and the analysis and detection results can be further improved.
The A-HMM algorithm is mainly explained below. In the modeling process of the traditional HMM algorithm, an initial state transition matrix is first determined, and the state transition matrix is then obtained by iterating on the initial state transition matrix until a convergence condition is reached. In the traditional HMM algorithm, however, although the probability values of the behavior features of abnormal users are distributed in the state transition matrix with significant differences, most users are normal users, and the probability values of the behavior features of normal users vary little in the state transition matrix. In this case, the state transition matrix corresponding to abnormal users is affected by the normal users: after the final state transition matrix converges, the state transition matrix of the normal users assimilates that of the abnormal users, in a manner similar to averaging. As a result, the finally converged state transition matrix can hardly focus on abnormal users, and false reports of abnormal behavior occur.
In order to solve the above problems of the conventional HMM algorithm, in the embodiment of the present invention, an attention weight α and an attention factor β are introduced on the basis of the conventional HMM algorithm to improve and optimize it, so as to obtain the A-HMM algorithm. The A-HMM algorithm satisfies the following equation:
Figure BDA0002873323210000151
wherein AT is the optimized state transition matrix, T is the state transition matrix, α is the attention weight with 0 ≤ α ≤ 1, β is the attention factor, i is a positive integer with 1 ≤ i ≤ N, and N is the total number of elements in the state transition matrix.
The attention weight value α may be preset in the electronic device, or may be learned through a sample data set. The attention factor β may have any value, and is not limited in the embodiment of the present invention.
The A-HMM algorithm focuses more on abnormal behaviors, and can detect single-point abnormal behaviors or contextual abnormal behaviors which may exist so as to further improve the analysis detection result.
In the embodiment of the invention, the electronic equipment can detect different abnormal behaviors by embedding different analysis algorithms, thereby further improving the analysis and detection results of the behavior analysis platform.
Example 6:
In order to further improve the analysis and detection result, on the basis of the above embodiments, in the embodiment of the present invention, the method further includes:
A: selecting the first piece of second data from at least one piece of second data in a first data set as the currently analyzed second data;
B: analyzing the currently analyzed second data; if a single-point anomaly is detected for the currently analyzed second data, increasing the current value of α by a first value; if a context anomaly is detected for the currently analyzed second data, reducing the current value of α by a second value; otherwise, keeping the current value of α unchanged;
C: judging whether the currently analyzed second data is the last data; if not, executing D; if yes, executing E;
D: selecting, from the at least one piece of second data, the data following the currently analyzed second data as the currently analyzed second data, and returning to step B;
E: determining the current value of α as the final value of α.
In the embodiment of the invention, an attention weight and an attention factor are introduced into the traditional HMM algorithm, and the initial state and the iterative process of the user are improved and optimized in a self-adaptive manner to obtain the A-HMM algorithm. By improving and optimizing the traditional HMM in this way, the resulting A-HMM algorithm can focus on abnormal behaviors and avoid missed reports of abnormal behaviors.
The electronic device may acquire a first data set, where the first data set includes at least one second data, where the at least one second data may include abnormal behavior data and may include normal behavior data, and the at least one second data may be regarded as sample data for learning an attention weight value α, and the electronic device may learn a final value of the α according to the at least one second data.
α represents the degree of attention paid to the user behavior in the initial state. The larger the value of α, the more attention is paid to the user behavior in the initial state, so that single-point anomalies can be effectively detected and single-point abnormal behaviors better distinguished. The smaller the value of α, the more attention is paid to the iterative process, so that context anomalies can be effectively detected and contextual abnormal behaviors better distinguished.
The electronic device may analyze each second data separately. If a single-point anomaly is detected for the currently analyzed second data, the current value of α is increased by a first value. This is because, for an abnormal user exhibiting single-point abnormal behavior, the probability values in the state transition matrix change greatly, whereas the probability values of normal users are distributed uniformly in the state transition matrix; increasing α therefore allows single-point anomalies to be found more accurately and rapidly. α attends to changes in the probability values in the state transition matrix, and where significantly different probability values exist, increasing α strengthens the attention paid to them in the next iteration. If a context anomaly is detected for the currently analyzed second data, the current value of α is reduced by a second value. This is because, for an abnormal user exhibiting contextual abnormal behavior, the anomaly can be detected only when a certain condition is satisfied, which often requires multiple iterations: only when the probability values in the state transition matrix have changed to within a certain range can it be detected whether the condition is satisfied. More attention therefore needs to be paid to how the state transition matrix changes during iteration, and reducing α strengthens the attention paid to the iterative process. If neither a single-point anomaly nor a context anomaly is detected for the currently analyzed second data, the current value of α may be kept unchanged.
It is understood that, when the first second data in the first data set is analyzed, the current value of α is the initial value of α; when any subsequent second data is analyzed, the current value of α is the α value determined when the second data immediately preceding the currently analyzed second data was analyzed. After the last second data is analyzed, the α value determined at that point may be taken as the final value of α, which is the value of α used in the A-HMM algorithm.
The first value and the second value may be any values, and the first value and the second value may be the same or different, and are not limited in the embodiment of the present invention.
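Steps A to E amount to a single pass over the sample data that nudges α up or down. The sketch below shows that loop under stated assumptions: `detect` is a hypothetical callable standing in for the actual analysis of each second data, the default values are arbitrary, and clamping α to the range from 0 to 1 is an assumption made to respect the constraint on α given for the A-HMM equation.

```python
from typing import Callable, Iterable, TypeVar

Sample = TypeVar("Sample")


def learn_alpha(
    samples: Iterable[Sample],
    detect: Callable[[Sample, float], str],
    initial_alpha: float = 0.5,
    first_value: float = 0.05,
    second_value: float = 0.05,
) -> float:
    """Learn the attention weight from sample data (steps A to E).

    `detect` is a hypothetical detector returning "single_point", "context",
    or any other string when no anomaly is found.
    """
    alpha = initial_alpha
    for sample in samples:  # steps A, C and D: walk the data set in order
        kind = detect(sample, alpha)  # step B: analyze the current data
        if kind == "single_point":
            alpha = min(1.0, alpha + first_value)  # clamp: assumed from 0 <= alpha <= 1
        elif kind == "context":
            alpha = max(0.0, alpha - second_value)
        # otherwise the current value of alpha is kept unchanged
    return alpha  # step E: the current value becomes the final value
```

Because each update depends only on the current value of α and the current sample, the same loop can be resumed later with new samples by passing the previous result as `initial_alpha`.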
It is to be noted that the A-HMM algorithm may be used on its own as an algorithm for anomaly analysis, in which case the first data set may be an original data set; alternatively, the A-HMM algorithm may be used to perform secondary verification on anomaly analysis results obtained by other algorithms, in which case the first data set may be an alarm data set.
A single-point abnormal behavior is generally carried out through a single abnormal operation by a malicious user; a login replay attack is a typical single-point abnormal behavior. A contextual abnormal behavior generally consists of multiple consecutive operations, each of which is suspected of reaching an alarm condition without fully reaching it. A typical example is as follows: the alarm condition is that the download volume in a first time period reaches 80M, and the electronic device detects that the download volume of a certain user is 40M at a first time point, 70M at a second time point, and 75M at a third time point in the first time period. Each download volume is suspected of reaching the 80M alarm condition, and detecting the contextual abnormal behavior formed by these consecutive operations requires multiple iterative detections.
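One way to read the download-volume example is as a run of consecutive observations that each come close to the alarm threshold without reaching it. The sketch below encodes that reading only; it is not the A-HMM detection itself, and the parameters `suspect_ratio` (how close to the threshold counts as suspected) and `min_run` (how many consecutive suspected operations are required) are hypothetical.

```python
from typing import Sequence


def is_contextual_anomaly(
    volumes: Sequence[float],
    alarm_threshold: float = 80.0,
    suspect_ratio: float = 0.5,  # hypothetical: at least 50% of the threshold is "suspected"
    min_run: int = 3,            # hypothetical: consecutive suspected operations required
) -> bool:
    """Flag a run of operations that each approach, but never reach, the alarm threshold."""
    run = 0
    for volume in volumes:
        if suspect_ratio * alarm_threshold <= volume < alarm_threshold:
            run += 1
            if run >= min_run:
                return True
        else:
            # Either clearly normal, or an ordinary alarm at or above the threshold.
            run = 0
    return False
```

With the figures from the example, the sequence 40, 70, 75 against a threshold of 80 is flagged, while any single one of those values on its own is not.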
In the embodiment of the invention, the electronic equipment improves and optimizes the initial state and the iterative process of the user in a self-adaptive mode by introducing the attention weight and the attention factor into the traditional HMM algorithm to obtain the A-HMM algorithm, and the A-HMM algorithm can focus on the abnormal behaviors and avoid the missing report of the abnormal behaviors.
Example 7:
In order to further reduce the difficulty in using the behavior analysis platform, on the basis of the foregoing embodiments, the embodiments of the present invention further include:
determining, for each third data of at least one third data in a second data set, an abnormal behavior threat level of the third data, and if the abnormal behavior threat level of the third data exceeds a threat level threshold value, determining the third data as malicious sample data;
determining a second analysis algorithm corresponding to at least one second analysis stage in a second service scene and an analysis result of the second analysis algorithm on the malicious sample data, and determining a first score of the second analysis algorithm in the second analysis stage;
determining a second score of a combination of at least one second analysis algorithm in the second service scenario according to the first score of the second analysis algorithm of each second analysis stage;
for each second analysis algorithm in the at least one second analysis algorithm, determining a recommendation level of the second analysis algorithm in the corresponding second analysis stage in the second business scenario according to the first score and the second score of the second analysis algorithm.
In order to recommend a suitable analysis algorithm to the user more accurately, the analysis algorithms in different analysis stages can be evaluated using various evaluation indexes. The evaluation indexes include traditional evaluation indexes such as accuracy, precision, recall, and F-score, and may also include newly added evaluation indexes; for example, the embodiment of the present invention adds a scenario-based evaluation index called CME. The CME fully considers the applicability of different analysis algorithms in a service scenario, and can therefore reduce or avoid the false alarms and missed alarms that may occur in user behavior analysis.
The electronic device may obtain a second data set, where the second data set includes at least one third data, and the third data may include abnormal behavior data and may include normal behavior data. The at least one third data may be regarded as historical data used for CME evaluation, and the electronic device may evaluate the analysis algorithms in different analysis stages according to the at least one third data.
The electronic device may determine the abnormal behavior threat level of each third data, and a threat level threshold, which may be specified by a platform developer, may be stored in the electronic device. For each third data, if the abnormal behavior threat level of the third data exceeds the threat level threshold, the third data is malicious sample data, also called black sample data, and can be regarded as abnormal behavior data; if the abnormal behavior threat level of the third data does not exceed the threat level threshold, the third data is non-malicious sample data, also called white sample data, and can be regarded as normal behavior data.
In the subsequent CME evaluation, each third data in the second data set may be evaluated, or only each malicious sample data in the second data set may be evaluated; optionally, the malicious samples in the second data set may be saved in a new third data set for evaluation. The embodiment of the present invention mainly describes the evaluation of each malicious sample data.
In a second service scenario, the electronic device may determine the second analysis algorithm corresponding to each second analysis stage in at least one second analysis stage. For convenience of description, the second analysis algorithms corresponding to the second analysis stages are combined, following the idea of ensemble learning, into a combination model. Since a plurality of analysis algorithms may be available in each second analysis stage, a plurality of combination models may be determined in the second service scenario. The embodiment of the present invention takes combination model 1 as an example, where combination model 1 includes analysis algorithm 11 corresponding to analysis stage 1 and analysis algorithm 21 corresponding to analysis stage 2.
For analysis stage 1 in the second service scenario, the electronic device determines the analysis result of analysis algorithm 11 on the malicious sample data in analysis stage 1, and determines a first score of analysis algorithm 11 in analysis stage 1. Likewise, the electronic device can determine a first score of analysis algorithm 21 in analysis stage 2.
The first score indicates how accurate the analysis result of analysis algorithm 11 is in analysis stage 1. In one possible way, the accuracy may be determined based on the existing evaluation indexes, for example by comparing the analysis result of analysis algorithm 11 on the malicious sample data in analysis stage 1 with a previously saved analysis result of the malicious sample data, and determining the first score of analysis algorithm 11 in analysis stage 1 according to the deviation found in the comparison. In another possible way, the electronic device may determine the first score of the second analysis algorithm in the second analysis stage according to the following formula:
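The thresholding step above can be sketched in a few lines. The function name `split_by_threat_level` and the callable `threat_level_of` are hypothetical; how the threat level itself is computed is not specified here and is left to the caller.

```python
from typing import Callable, List, Tuple, TypeVar

T = TypeVar("T")


def split_by_threat_level(
    third_data: List[T],
    threat_level_of: Callable[[T], float],
    threshold: float,
) -> Tuple[List[T], List[T]]:
    """Split data into (black samples, white samples) using the threat level threshold."""
    black: List[T] = []
    white: List[T] = []
    for item in third_data:
        # Strictly exceeding the threshold marks the item as malicious (black).
        if threat_level_of(item) > threshold:
            black.append(item)
        else:
            white.append(item)
    return black, white
```

The black list returned here corresponds to the malicious sample data on which the later scoring is mainly described.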
Figure BDA0002873323210000191
wherein H(V(s_i), V(s_k)) is the first score of the second analysis algorithm in the second analysis stage; V(s_i) is the abnormal behavior threat level obtained by the second analysis algorithm analyzing sample s_i; V(s_k) is the actual abnormal behavior threat level of node s_k; i is a positive integer greater than 0 and not greater than M; M, a positive integer, is the total amount of data (for example, the total number of malicious sample data); k denotes the k-th node and is a positive integer greater than 0 and not greater than K; K is the total number of analysis stages; and λ is an adaptive smoothing factor.
After determining the first score of the second analysis algorithm of each analysis stage, the electronic device may determine, according to each first score, a second score of the combined model in the second service scenario. In one possible approach, the at least one first score may be accumulated to obtain the second score. In another possible way, the electronic device determines the second score of the combined model in the second service scenario according to the following formula:
Figure BDA0002873323210000201
wherein CME is the second score, the
Figure BDA0002873323210000202
If the i-th sample data is black sample data, C_i is 1; if the i-th sample data is white sample data, C_i is 0; the
Figure BDA0002873323210000203
wherein t_i is the threat level factor of sample s_i.
In the actual use process of the subsequent user behavior analysis, the electronic device may directly recommend different combination models for the actual first service scenario according to the second scores of the different combination models in the different service scenarios. Or for different analysis stages, the electronic device may recommend a different analysis algorithm directly for each first analysis stage in the actual first business scenario.
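To make the combination-model scoring concrete, the sketch below enumerates every combination of one algorithm per stage and scores each by accumulating the first scores, which is the simpler of the two approaches mentioned above. It does not implement the CME formula. The table `FIRST_SCORES`, its values, and the function name are hypothetical.

```python
from itertools import product
from typing import Dict, List, Tuple

# Hypothetical first scores: analysis stage -> algorithm -> first score.
FIRST_SCORES: Dict[str, Dict[str, float]] = {
    "stage_1": {"algorithm_11": 0.8, "algorithm_12": 0.6},
    "stage_2": {"algorithm_21": 0.7, "algorithm_22": 0.9},
}


def score_combinations(
    first_scores: Dict[str, Dict[str, float]],
) -> List[Tuple[Dict[str, str], float]]:
    """Return every combination model with its accumulated score, best first."""
    stages = list(first_scores)
    results: List[Tuple[Dict[str, str], float]] = []
    # One (algorithm, score) choice per stage forms one combination model.
    for combo in product(*(first_scores[stage].items() for stage in stages)):
        model = {stage: name for stage, (name, _) in zip(stages, combo)}
        second_score = sum(score for _, score in combo)
        results.append((model, second_score))
    results.sort(key=lambda item: item[1], reverse=True)
    return results
```

The first entry of the returned list is the combination model that would be recommended for the scenario under this accumulation rule; the number of combinations grows multiplicatively with the number of algorithms per stage, so a real platform may need to prune candidates.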
The electronic device may also determine the recommendation level of each second analysis algorithm in the corresponding second analysis stage, for example the recommendation level of analysis algorithm 11 in analysis stage 1 and the recommendation level of analysis algorithm 21 in analysis stage 2.
In one possible approach, the electronic device may determine the recommendation level of the second analysis algorithm in the corresponding second analysis stage directly according to the first score of the second analysis algorithm.
In another possible way, the electronic device may determine, for each second analysis algorithm, the recommendation level of the second analysis algorithm in the corresponding second analysis stage in the second service scenario according to the first score and the second score of the second analysis algorithm. Specifically, for each second analysis algorithm, the electronic device may determine a third score of the second analysis algorithm according to the first score and the second score of the second analysis algorithm together with one or more of an adaptive smoothing factor, a threat level factor, and a historical evaluation result of the second analysis algorithm, and then determine, according to the third score, the recommendation level of the second analysis algorithm in the corresponding second analysis stage.
In the embodiment of the invention, the CME evaluation index fully considers the service scenario in the evaluation process, can adapt to different service scenarios, and can determine the recommendation levels of the analysis algorithms in different analysis stages under the service scenario, so that a more appropriate analysis algorithm can be recommended to the user, high-risk abnormal behaviors can be continuously tracked, and false alarms and missed alarms are reduced.
Example 8:
In order to further improve the accuracy of the analysis and detection result, on the basis of the above embodiments, the embodiment of the present invention further includes:
updating the second data set with the analysis result obtained by performing user behavior analysis on the first data.
Optionally, the electronic device may re-evaluate the evaluation indexes each time the second data set is updated, or the electronic device may update the evaluation results of the evaluation indexes periodically.
In the embodiment of the invention, the second data set is updated, so that the CME evaluation index results in different service scenes can be updated, the applicability to different service scenes is improved, the accuracy of analysis algorithm recommendation in different analysis stages in the service scenes is improved, and the accuracy of analysis detection results is further improved.
Example 9:
In order to further improve the applicability to different service scenarios, on the basis of the foregoing embodiments, the embodiments of the present invention further include:
if it is detected that abnormal behavior exists in the first data and that a third service scenario associated with the first service scenario exists, outputting alarm information indicating that the abnormal behavior exists in the first service scenario and in the third service scenario.
The user of the behavior analysis platform can associate different service scenes, and if abnormal behavior is detected in a first service scene, the electronic device can give an alarm to the abnormal behavior in a third service scene associated with the first service scene in time, so that the applicability of different service scenes and the timeliness of the alarm are further improved.
When detecting that the first data in the first data source has the abnormal behavior, the electronic device may directly output alarm information of the abnormal behavior in the first service scenario to which the first data source belongs, and output alarm information of the abnormal behavior in the third service scenario associated with the first service scenario.
Or when the electronic device detects that abnormal behavior exists in the first data source, the electronic device may perform user behavior analysis on a third service scenario associated with the first service scenario in time, and perform an alarm on the abnormal behavior obtained through the analysis.
The first service scenario and the third service scenario may be associated in a database using an association field.
In the embodiment of the invention, the electronic equipment not only carries out abnormal alarm on the first service scene which is currently analyzed, but also carries out abnormal alarm on the third service scene which is associated with the first service scene, so that the electronic equipment can adapt to more different service scenes and improve the timeliness of alarm.
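The direct-output variant of the associated-scenario alarm can be sketched with a simple association table. The scenario names, the message format, and the function `alarm_messages` are hypothetical; in the text the association is held in a database through an association field, for which a dictionary stands in here.

```python
from typing import Dict, List


def alarm_messages(
    first_scenario: str,
    anomalies: List[str],
    associations: Dict[str, List[str]],
) -> List[str]:
    """Build alarm messages for the analyzed scenario and every associated scenario."""
    if not anomalies:
        return []
    # The analyzed scenario first, then the scenarios associated with it.
    scenarios = [first_scenario] + associations.get(first_scenario, [])
    return [
        f"[{scenario}] abnormal behavior detected: {anomaly}"
        for scenario in scenarios
        for anomaly in anomalies
    ]
```

The alternative described above, in which detection in the first scenario triggers a fresh user behavior analysis of the associated scenario, would replace the message construction with a call into that analysis.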
Example 10:
The embodiments of the present invention are described below with reference to a specific embodiment. Fig. 2 shows a possible front-end interface provided by a behavior analysis platform (hereinafter referred to as the platform), through which a user can operate the platform. The left operation interface of the front-end interface is provided with operation buttons, specifically a data source button, a data processing button, a modeling analysis button, and a result display button. The right operation interface of the front-end interface displays a visualization area, specifically comprising a data visualization area, a model representation visualization area, and an alarm log output area.
The following describes, with reference to fig. 2, the process in which a user analyzes user behavior using the platform.
The user clicks the data source button or directly drags the data source button to the data visualization area, the system prompts the user to select the data storage position, namely the data source, the platform displays the data source selected by the user in the data visualization area, optionally, the platform can also pop up a corresponding window according to different selection modes of the user, and the user can conveniently configure the data source according to friendly prompt of the platform. For example, the user selects the position where the data is stored as the database, the platform can pop up a window corresponding to the database, and prompts that a data table related to login information of the access user is recommended to be selected under an abnormal service access scene in the window, so that the user can conveniently perform specific configuration on the data source based on the prompt information. After the data source is configured by the user, the platform may import the data in the data source selected by the user into the platform for subsequent analysis. Referring to fig. 3, an interactive processing layer of the platform may store execution logic for a user to complete selection or drag and pull, and after the user sends an action, the platform responds to an action event triggered by the user through a listener program, and executes a corresponding response mode to complete an action instruction sent by the user, for example, the user may drag and pull the selected data source to the right side, and these operations may have a corresponding event listener to monitor an event initiated by the user, and trigger a listener corresponding to the event for the initiated event, and a listener at the back end responds to the operation of the user, and returns a result to the user. 
Specifically, when the user clicks the data source button, a selection event listener of the platform detects the data source selection operation. Alternatively, when the user drags the data source button to the data visualization area, a drag listener of the platform detects the drag operation on the data source, and a function selection event listener of the platform then detects the user's selection of the data source. The platform responds, and a drawing event listener of the platform displays the obtained data source information in the data visualization area.
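The listener mechanism described above can be illustrated with a minimal sketch. The class name `EventBus`, the event names and the payload keys are hypothetical and are not taken from the platform; the sketch only shows how one user action can trigger several registered listeners whose results are returned to the caller.

```python
from collections import defaultdict
from typing import Callable, Dict, List


class EventBus:
    """Minimal listener registry mapping an event name to its listeners."""

    def __init__(self) -> None:
        self._listeners: Dict[str, List[Callable[[dict], str]]] = defaultdict(list)

    def on(self, event: str, listener: Callable[[dict], str]) -> None:
        """Register a listener for the given event name."""
        self._listeners[event].append(listener)

    def emit(self, event: str, payload: dict) -> List[str]:
        """Call every listener registered for the event, in registration order."""
        return [listener(payload) for listener in self._listeners[event]]


bus = EventBus()
# Hypothetical listeners: one for selection, two for drag (drag, then draw).
bus.on("click", lambda p: f"selected {p['target']}")
bus.on("drag", lambda p: f"dragged {p['target']} to {p['area']}")
bus.on("drag", lambda p: f"drew {p['target']} in {p['area']}")

results = bus.emit("drag", {"target": "data_source", "area": "data_visualization"})
```

Registering the drawing listener on the same event as the drag listener mirrors the sequence in the text, where the drag is detected first and the data source information is drawn afterwards.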
At this point, the model representation visualization area may display the data source selection stage as the stage currently in progress and then mark it as completed. Each functional stage in a service scene may be displayed in the model representation visualization area and connected into a linear structure, and different marking modes may be used for stages that are completed, in progress and not started.
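As a rough illustration of the stage marking, the following sketch labels each stage in a linear chain relative to the current one. The function names and the text symbols are assumptions chosen for the example; the actual platform renders this graphically.

```python
from typing import List, Tuple


def mark_stages(stages: List[str], current: str) -> List[Tuple[str, str]]:
    """Label every stage as completed, in progress or not started."""
    current_index = stages.index(current)
    marks = []
    for position, name in enumerate(stages):
        if position < current_index:
            status = "completed"
        elif position == current_index:
            status = "in progress"
        else:
            status = "not started"
        marks.append((name, status))
    return marks


def render(marks: List[Tuple[str, str]]) -> str:
    """Render the chain as text, one (hypothetical) symbol per status."""
    symbols = {"completed": "[x]", "in progress": "[>]", "not started": "[ ]"}
    return " -> ".join(f"{symbols[status]} {name}" for name, status in marks)


STAGES = ["data source", "data processing", "modeling analysis", "result display"]
line = render(mark_stages(STAGES, "data processing"))
```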
Assuming that the user has selected the first data source, the user then selects the "data processing" button, and the platform preprocesses the first data in the first data source in the preprocessing mode corresponding to the data format of the first data source. For example, for unstructured text data, the platform may clean out stray symbols such as "<", "?", "%" and "@"; for data containing missing values, the platform may fill in the missing values; the platform may also perform other data processing such as normalization.
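The three preprocessing operations mentioned above can be sketched as follows. The text does not say how missing values are filled or which normalization is used; mean filling and min-max scaling are assumptions made for this example, and all function names are hypothetical.

```python
import re
from typing import List, Optional


def clean_text(text: str) -> str:
    """Replace the stray symbols <, ?, % and @ with spaces and collapse whitespace."""
    without_symbols = re.sub(r"[<?%@]", " ", text)
    return re.sub(r"\s+", " ", without_symbols).strip()


def fill_missing(values: List[Optional[float]]) -> List[float]:
    """Fill None entries with the mean of the observed values (assumed strategy)."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("no observed values to compute a fill value from")
    fill = sum(observed) / len(observed)
    return [fill if v is None else v for v in values]


def min_max_normalize(values: List[float]) -> List[float]:
    """Scale values linearly into [0, 1]; a constant column maps to all zeros."""
    low, high = min(values), max(values)
    if high == low:
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]


cleaned = clean_text("login<ok?? user@host 100%")
filled = fill_missing([1.0, None, 3.0])
scaled = min_max_normalize(filled)
```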
The user then selects the "modeling analysis" button. In the modeling analysis stage, the user can build the user behavior analysis model directly by drag-and-drop. For example, the user drags a text data source into the data visualization area, and a listener of the platform responds to the drag event and identifies the data format of the data source. When the data source is recognized as text data, the platform may pop up, at the data processing button, a function menu of text data processing modes, for example a word segmentation operation on the text. The user can drag the word segmentation function from the function menu on the left to the model representation visualization area on the right. The platform may then run several word segmentation algorithms, such as a rule-based forward and backward matching algorithm, a language model word segmentation method, a hidden Markov word segmentation method and a deep-learning-based word segmentation method, and return the segmentation result with the best effect; alternatively, the platform may display the word segmentation algorithms for the user to choose from and return the result of the algorithm the user selects. By analogy, the user can drag each stage to the model representation visualization area and, after each stage is completed, select the next stage until all stages have been selected and analyzed, at which point the whole modeling work is complete.
The description here mainly takes the usage process as an example. Before the user formally uses the platform, the platform may be connected to all scenes that may be involved and trained in each scene, so that it is better suited to the user's actual use.
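Of the word segmentation algorithms listed, rule-based forward matching is the simplest to show. The sketch below is a generic forward maximum matching segmenter, not the platform's implementation; the vocabulary and the input string are made up for the example.

```python
from typing import List, Optional, Set


def forward_max_match(text: str, vocab: Set[str], max_len: Optional[int] = None) -> List[str]:
    """Greedy left-to-right segmentation: at each position take the longest
    vocabulary word that matches, falling back to a single character."""
    if max_len is None:
        max_len = max((len(word) for word in vocab), default=1)
    tokens: List[str] = []
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink until something matches.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in vocab:
                tokens.append(piece)
                i += size
                break
    return tokens


vocab = {"user", "login", "log", "in", "failed"}
tokens = forward_max_match("userloginfailed", vocab)
```

Backward matching works the same way from the end of the string; running both and comparing the results is one simple way to pick the better segmentation.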
After the whole modeling work is completed, the platform also completes the analysis of all stages, and then the analysis result can be displayed in an alarm log output area, wherein the alarm log output area can display that abnormal behaviors or abnormal users do not exist in the first data source, or can display that abnormal behaviors or abnormal users exist in the first data source, and more specifically can display specific information of the abnormal behaviors or the abnormal users.
Fig. 2 illustrates an embodiment of the present invention mainly from the perspective of front-end interaction with a user, and a back-end processing procedure of the behavior analysis platform will be described below, referring to fig. 4.
In terms of usability, the user can build the analysis directly by dragging with the mouse. Unlike in traditional products, analysis stage selection, algorithm selection and parameter configuration do not have to be performed under complex menus; the whole modeling analysis work can be completed by making selections according to the prompt information in the platform.
In terms of functionality, the platform covers the full cycle of development and use of a behavior analysis model. Here, "behavior analysis model" is a general term for the whole process of user behavior analysis on a data source, hereinafter referred to as the model. The full cycle of model development includes modeling preparation, data preprocessing, feature engineering, algorithm selection, model tuning, model output, model production and the like. Covering the whole life cycle of analytical modeling allows a prototype of the analysis platform to be built from scratch: a basic analytical modeling process is established and implemented in the product during the development stage, and the analytical modeling capability and corresponding algorithm models are then enriched and extended step by step, so that the platform is continuously improved and acquires complete analysis capability.
For modeling preparation, the platform provides a variety of functional modules, including but not limited to one or more of the following: a data source module, a data preprocessing module, a data management module, an algorithm support module, a model evaluation module and a multi-scene association module.
The platform can acquire the cloud data source, the local data source and the third-party data source through the data source module. For a data source, the platform may call an API (Application Programming Interface) corresponding to the data source to obtain data in the data source.
The data preprocessing module mainly comprises two functional parts: a data source identification function and a preprocessing function. Through the data source identification part, the platform identifies the data format of the data imported from the data source and feeds the identified format back to the preprocessing part. Through the preprocessing part, the platform selects a preprocessing mode according to the data format and preprocesses the first data in the first data source in the mode corresponding to the data format of the first data source, thereby standardizing different data formats.
The platform can determine the behavior analysis scene of the first data source through the data management module, preprocess the first data with the data preprocessing module according to the behavior analysis scene and the data, and then extract features from the preprocessed first data based on feature engineering. Standardizing data in different formats ensures that data is already standardized before it enters the platform. The preprocessing process can be attached as plug-ins: for example, each log with a new data format is handled by a dedicated adapter program, so that the preprocessing function library grows as the number of log types increases, achieving dynamic expansion of the library and continuous improvement of log analysis capability.
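The plug-in style of preprocessing can be sketched as a registry keyed by data format, so that supporting a new log format only requires registering one more function. The registry name, the decorator and the very crude format detection below are all assumptions for illustration.

```python
import csv
import io
import json
from typing import Callable, Dict, List

# Hypothetical plug-in registry: data format name -> preprocessing function.
PREPROCESSORS: Dict[str, Callable[[str], List[dict]]] = {}


def register(fmt: str):
    """Decorator that adds a preprocessing plug-in for one data format."""
    def decorator(fn: Callable[[str], List[dict]]):
        PREPROCESSORS[fmt] = fn
        return fn
    return decorator


@register("json")
def from_json(raw: str) -> List[dict]:
    data = json.loads(raw)
    return data if isinstance(data, list) else [data]


@register("csv")
def from_csv(raw: str) -> List[dict]:
    return [dict(row) for row in csv.DictReader(io.StringIO(raw))]


def identify_format(raw: str) -> str:
    """Crude format detection (assumption): JSON starts with { or [, else CSV."""
    return "json" if raw.lstrip().startswith(("{", "[")) else "csv"


def preprocess(raw: str) -> List[dict]:
    """Identify the format, then dispatch to the matching plug-in."""
    return PREPROCESSORS[identify_format(raw)](raw)


json_rows = preprocess('{"user": "alice", "action": "upload"}')
csv_rows = preprocess("user,action\nbob,login\n")
```

Both inputs end up as the same standardized shape, a list of field dictionaries, which is the point of normalizing formats before analysis.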
For example, in an abnormal file transmission scene with JSON-format data, the data is split by field and converted into the DataFrame data format. The API for baseline generation in that scene is then called to train a baseline model.
Building a platform from scratch is very difficult, especially a platform involving model algorithms, which must address compatibility, performance, accuracy, input and output, and other issues. First, a main process framework may be established in which modules such as data processing, model analysis and data output are decoupled. On this basis, both self-developed algorithms and third-party algorithms are supported: by providing only a jar package and a simple configuration file, a model can analyze ISOP platform data and output the corresponding alarms.
On the user side, the modeling process can be carried out by free, dynamic drag-and-drop, which opens up the basic data capability of the ISOP (Intelligent Security Operation Platform) and improves the analysis capability of the platform. The platform uses test data to establish a model suitable for the user's product.
A plurality of algorithm models are built into the algorithm support module. In addition to traditional rule-based and statistics-based analysis algorithms and time series analysis algorithms, the module covers current mainstream and leading-edge machine learning algorithms, including traditional classification, regression, clustering and association algorithms as well as deep learning algorithms, and also provides a new A-HMM algorithm. During training, the algorithm support module selects algorithms, such as a time series analysis algorithm and a clustering algorithm, according to the configuration file and the data type of the scene, and selects the optimal model according to model performance. The data to be tested is then input into the model, and after evaluation by the alarm API at the back end, the alarm result is presented at the front end for the user's analysis and reference. The user selects different presentation modes by dragging at the front end to complete the abnormal behavior analysis process.
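The idea of choosing candidate algorithms by data type and keeping the best performer can be sketched as follows. The two detectors (a z-score rule and a median-deviation rule), the F1 criterion and the `CANDIDATES` table are stand-ins chosen for this example; the text does not specify which algorithms or which performance measure the platform uses.

```python
from statistics import mean, median, pstdev
from typing import Callable, Dict, List, Sequence, Tuple

Detector = Callable[[Sequence[float]], List[bool]]


def zscore_flags(values: Sequence[float], threshold: float = 2.0) -> List[bool]:
    """Flag values more than `threshold` standard deviations from the mean."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [False] * len(values)
    return [abs(v - mu) / sigma > threshold for v in values]


def median_flags(values: Sequence[float], factor: float = 3.0) -> List[bool]:
    """Flag values far from the median, measured in median absolute deviations."""
    mid = median(values)
    mad = median([abs(v - mid) for v in values])
    if mad == 0:
        return [False] * len(values)
    return [abs(v - mid) / mad > factor for v in values]


# Hypothetical configuration: data type of the scene -> candidate algorithms.
CANDIDATES: Dict[str, Dict[str, Detector]] = {
    "time_series": {"zscore": zscore_flags, "median": median_flags},
}


def f1(pred: List[bool], truth: List[bool]) -> float:
    tp = sum(p and t for p, t in zip(pred, truth))
    fp = sum(p and not t for p, t in zip(pred, truth))
    fn = sum(t and not p for p, t in zip(pred, truth))
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0


def select_best(data_type: str, values: Sequence[float],
                truth: List[bool]) -> Tuple[str, Dict[str, float]]:
    """Score every candidate on labeled data and return the best-scoring name."""
    scores = {name: f1(fn(values), truth) for name, fn in CANDIDATES[data_type].items()}
    return max(scores, key=scores.get), scores


values = [10, 11, 9, 10, 12, 10, 11, 50]
truth = [False] * 7 + [True]
best, scores = select_best("time_series", values, truth)
```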
In the process of establishing the model, the algorithms used in different functional stages can be evaluated with test data, and the model evaluation module of the platform can determine the analysis algorithms recommended for different functional stages in different service scenes. The model evaluation module contains evaluation indexes for different kinds of models, including indexes for classification models, regression models, clustering models and the like, and displays the index values visually on the interface. In addition to the existing evaluation indexes, the invention provides a new model evaluation index, CME. Existing evaluation indexes target a single model and do not take the service scene into account. In some specific fields, such as behavior analysis, sample imbalance is a common problem, so directly using the common indexes can lead to problems such as missed alarms and false alarms. The CME index introduces scene knowledge into the evaluation and balances the performance of the combined model by adding a threat level factor. Compared with traditional indexes, it enhances the interpretability of the model, and it can be adjusted through the adaptive factor and the threat level factor to suit different scenes. The evaluation indexes can be flexibly configured by calling background interfaces during drag-and-drop operations.
Through the multi-scene association module, the platform can associate different service scenes in the database by using association fields, so that when an abnormal user triggers an alarm in one scene, potential abnormal behavior events in other scenes can be discovered. The platform then outputs the corresponding alarm result, which is displayed at the front end.
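A minimal sketch of multi-scene association, assuming the association field is a user name: when an alarm fires in one scene, events from the other scenes that share the same field value are collected. The function name, the record layout and the scene names are hypothetical.

```python
from typing import Dict, List


def correlate(alert: dict, scene_events: Dict[str, List[dict]],
              field: str = "user") -> Dict[str, List[dict]]:
    """Return, per other scene, the events sharing the alert's association field."""
    related: Dict[str, List[dict]] = {}
    for scene, events in scene_events.items():
        if scene == alert["scene"]:
            continue  # only look at scenes other than the one that raised the alert
        matched = [e for e in events if e.get(field) == alert.get(field)]
        if matched:
            related[scene] = matched
    return related


scene_events = {
    "abnormal_login": [{"user": "alice", "ip": "10.0.0.7"}],
    "file_transfer": [{"user": "alice", "bytes": 9000}, {"user": "bob", "bytes": 12}],
    "service_access": [{"user": "carol", "path": "/admin"}],
}
alert = {"scene": "abnormal_login", "user": "alice"}
related = correlate(alert, scene_events)
```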
The security analysis modeling system provided by the embodiment of the invention is customizable: security data normalization, feature selection, algorithm integration and security alarm output can all be carried out by drag-and-drop on a page, and a solution for a given security scene is formed by connecting models in series. The system defines the complete security modeling process and runs through the whole life cycle of security analysis modeling, covering data preprocessing, feature engineering, algorithm modeling, algorithm evaluation, algorithm production and the like. As a complete solution for security analysis modeling, it can greatly reduce the difficulty of applying machine learning algorithms in the security industry and broaden the scenes in which the algorithms are used. Moreover, optimization is performed on the basis of existing models to improve their usability, and algorithms can be evaluated from multiple angles, such as performance, accuracy, breadth of data access and breadth of scene support, which improves the accuracy of the platform's detection and analysis results.
Example 11:
on the basis of the foregoing embodiments, an embodiment of the present invention provides a user behavior analysis apparatus, and fig. 5 is a schematic diagram of a user behavior analysis apparatus provided in an embodiment of the present invention, as shown in fig. 5, the apparatus includes:
a determining module 51, configured to determine a first data source, and determine a first service scenario to which the first data source belongs, and at least one first analysis stage corresponding to the first data source, where the first data source includes first data to be analyzed; determining a first analysis algorithm corresponding to each first analysis stage in the at least one first analysis stage according to a first service scenario to which the first data source belongs;
the analysis module 52 is configured to perform user behavior analysis on the first data by using a first analysis algorithm corresponding to each first analysis stage according to the execution sequence of the at least one first analysis stage.
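The division of work between the two modules can be sketched as a lookup followed by an ordered run. In this hedged example, `SCENARIO_PLAN` plays the role of the determining module (service scenario to ordered stages and their algorithms) and `analyze` plays the role of the analysis module; the scenario name, the stage names and the two toy algorithms are invented for illustration.

```python
from typing import Callable, Dict, List, Tuple

Algorithm = Callable[[List[dict]], List[dict]]


def drop_incomplete(records: List[dict]) -> List[dict]:
    """Toy preprocessing stage: keep records that have a user and an hour."""
    return [r for r in records if r.get("user") and r.get("hour") is not None]


def flag_off_hours(records: List[dict]) -> List[dict]:
    """Toy detection stage: mark accesses before 06:00 or from 22:00 as abnormal."""
    return [{**r, "abnormal": r["hour"] < 6 or r["hour"] >= 22} for r in records]


# Hypothetical plan: service scenario -> analysis stages in execution order.
SCENARIO_PLAN: Dict[str, List[Tuple[str, Algorithm]]] = {
    "abnormal_service_access": [
        ("preprocessing", drop_incomplete),
        ("detection", flag_off_hours),
    ],
}


def analyze(scenario: str, first_data: List[dict]) -> List[dict]:
    """Apply each stage's algorithm to the data in the planned order."""
    data = first_data
    for _stage_name, algorithm in SCENARIO_PLAN[scenario]:
        data = algorithm(data)
    return data


records = [
    {"user": "alice", "hour": 3},
    {"user": "bob", "hour": 14},
    {"user": "", "hour": 2},
]
result = analyze("abnormal_service_access", records)
```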
In a possible implementation, the determining module 51 is specifically configured to recommend, according to a first service scenario to which the first data source belongs, at least one analysis algorithm for each first analysis stage; and determining the analysis algorithm selected by the user in the at least one analysis algorithm as the first analysis algorithm corresponding to the first analysis stage.
In a possible implementation manner, the determining module 51 is specifically configured to recommend, according to the first service scenario to which the first data source belongs, at least one analysis algorithm for each first analysis stage, and output a recommendation level of each analysis algorithm.
In a possible embodiment, the apparatus further comprises:
a preprocessing module to identify a first data format of the first data source; preprocessing the first data according to the first data format, wherein different data formats correspond to different preprocessing modes;
the analysis module 52 is specifically configured to perform user behavior analysis on the preprocessed first data by using a first analysis algorithm corresponding to each first analysis stage according to the execution sequence of the at least one first analysis stage.
In one possible embodiment, the analysis algorithm includes at least one of the following algorithms: a rule analysis algorithm, a statistical analysis algorithm, a time series analysis algorithm, a machine learning algorithm, an improved hidden Markov model A-HMM algorithm.
In one possible implementation, the A-HMM algorithm satisfies the following equation:
Figure BDA0002873323210000281
wherein AT is the optimized state transition matrix, T is the state transition matrix, alpha is the attention weight, alpha is not less than 0 and not more than 1, beta is the attention factor, i is a positive integer not less than 1 and not more than N, and N is the total number of elements in the state transition matrix.
In a possible implementation, the determining module 51 is further configured to perform the following steps:
A: selecting the first piece of second data from at least one piece of second data in a first data set as the currently analyzed second data;
B: analyzing the currently analyzed second data; if a single point anomaly is detected in the currently analyzed second data, increasing the current value of alpha by a first value; if a context anomaly is detected in the currently analyzed second data, reducing the current value of alpha by a second value; otherwise, keeping the current value of alpha unchanged;
C: judging whether the currently analyzed second data is the last piece of data; if not, executing step D; if yes, executing step E;
D: selecting, from the at least one piece of second data, the piece following the currently analyzed second data as the currently analyzed second data, and returning to step B;
E: determining the current value of alpha as the final value of alpha.
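The iteration in steps A to E amounts to a single pass over the data set with a running value of alpha. The sketch below assumes a caller-supplied `classify` function that labels each item as a single point anomaly, a context anomaly or normal; the initial value, the two step sizes and the clamping to the range 0 to 1 (taken from the stated constraint on alpha) are assumptions, since the text gives no concrete values.

```python
from typing import Callable, Iterable


def tune_alpha(samples: Iterable[dict], classify: Callable[[dict], str],
               alpha: float = 0.5, first_value: float = 0.125,
               second_value: float = 0.125) -> float:
    """Walk the data set once, adjusting alpha per detected anomaly type."""
    for sample in samples:            # steps A, C and D: visit items in order
        kind = classify(sample)       # step B: analyze the current item
        if kind == "point":           # single point anomaly: raise alpha
            alpha += first_value
        elif kind == "context":       # context anomaly: lower alpha
            alpha -= second_value
        # Assumption: keep alpha inside [0, 1], as required of the attention weight.
        alpha = min(1.0, max(0.0, alpha))
    return alpha                      # step E: the final value of alpha


samples = [{"kind": "point"}, {"kind": "normal"}, {"kind": "context"}, {"kind": "point"}]
final_alpha = tune_alpha(samples, classify=lambda s: s["kind"])
```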
In a possible implementation manner, the determining module 51 is further configured to determine, for each third data in at least one third data in the second data set, an abnormal behavior threat level of the third data, and if the abnormal behavior threat level of the third data exceeds a threat level threshold, determine that the third data is malicious sample data; determining a second analysis algorithm corresponding to at least one second analysis stage in a second service scene and an analysis result of the second analysis algorithm on the malicious sample data, and determining a first score of the second analysis algorithm in the second analysis stage; determining a second score of a combination of at least one second analysis algorithm in the second service scenario according to the first score of the second analysis algorithm of each second analysis stage; for each second analysis algorithm in the at least one second analysis algorithm, determining a recommendation level of the second analysis algorithm in the corresponding second analysis stage in the second business scenario according to the first score and the second score of the second analysis algorithm.
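The text does not give formulas for the first score, the second score or the recommendation level, so the following sketch substitutes simple stand-ins and should be read only as one possible shape of the computation: the first score is the fraction of malicious samples an algorithm flags, the second score of a combination is the mean of its members' first scores, and the recommendation level of an algorithm is the average of its first score and the best second score among combinations that contain it. All names and data are hypothetical.

```python
from itertools import product
from statistics import mean
from typing import Callable, Dict, List

Detect = Callable[[dict], bool]


def malicious_samples(samples: List[dict], threshold: float) -> List[dict]:
    """Keep samples whose abnormal behavior threat level exceeds the threshold."""
    return [s for s in samples if s["threat_level"] > threshold]


def first_score(detect: Detect, malicious: List[dict]) -> float:
    """Assumed first score: share of malicious samples the algorithm flags."""
    if not malicious:
        return 0.0
    return sum(1 for s in malicious if detect(s)) / len(malicious)


def recommendation_levels(stage_algorithms: Dict[str, Dict[str, Detect]],
                          malicious: List[dict]) -> Dict[str, Dict[str, float]]:
    firsts = {stage: {name: first_score(fn, malicious) for name, fn in algos.items()}
              for stage, algos in stage_algorithms.items()}
    stages = list(firsts)
    levels: Dict[str, Dict[str, float]] = {stage: {} for stage in stages}
    # Enumerate every combination of one algorithm per stage.
    for combo in product(*(firsts[stage].items() for stage in stages)):
        second = mean([score for _name, score in combo])  # assumed second score
        for stage, (name, score) in zip(stages, combo):
            level = (score + second) / 2                  # assumed combination rule
            levels[stage][name] = max(levels[stage].get(name, 0.0), level)
    return levels


samples = [
    {"threat_level": 9, "bytes": 500, "hour": 3},
    {"threat_level": 7, "bytes": 50, "hour": 2},
    {"threat_level": 1, "bytes": 10, "hour": 12},
]
malicious = malicious_samples(samples, threshold=5)
levels = recommendation_levels(
    {
        "feature": {"size_rule": lambda s: s["bytes"] > 100, "always": lambda s: True},
        "detect": {"night_rule": lambda s: s["hour"] < 6},
    },
    malicious,
)
```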
In a possible embodiment, the apparatus further comprises:
and the updating module is used for updating the second data set by adopting an analysis result obtained by analyzing the user behavior of the first data.
In a possible embodiment, the apparatus further comprises:
and the alarm module is used for outputting, if it is detected that abnormal behavior exists in the first data and that a third service scene associated with the first service scene exists, alarm information of the abnormal behavior existing in the first service scene and the third service scene.
According to the embodiment of the invention, aiming at the first service scene to which the first data source belongs, the corresponding first analysis algorithm can be recommended for each first analysis stage in the first data source, and then the first analysis algorithm corresponding to each first analysis stage is adopted according to the execution sequence of each first analysis stage to perform user behavior analysis on the first data to be analyzed in the first data source.
Example 12:
on the basis of the foregoing embodiments, an embodiment of the present invention further provides an electronic device, and fig. 6 is a schematic structural diagram of the electronic device provided in the embodiment of the present invention. As shown in fig. 6, the electronic device includes: a processor 61, a communication interface 62, a memory 63 and a communication bus 64, wherein the processor 61, the communication interface 62 and the memory 63 communicate with one another through the communication bus 64;
the memory 63 has stored therein a computer program which, when executed by the processor 61, causes the processor 61 to perform the steps of:
determining a first data source, determining a first service scene to which the first data source belongs, and determining at least one first analysis stage corresponding to the first data source, wherein the first data source comprises first data to be analyzed;
determining a first analysis algorithm corresponding to each first analysis stage in the at least one first analysis stage according to a first service scenario to which the first data source belongs;
and according to the execution sequence of the at least one first analysis stage, adopting a first analysis algorithm corresponding to each first analysis stage to analyze the user behavior of the first data.
In a possible implementation, the processor 61 is specifically configured to recommend, for each first analysis phase, at least one analysis algorithm for the first analysis phase according to a first business scenario to which the first data source belongs; and determining the analysis algorithm selected by the user in the at least one analysis algorithm as the first analysis algorithm corresponding to the first analysis stage.
In a possible implementation, the processor 61 is specifically configured to recommend, according to the first service scenario to which the first data source belongs, at least one analysis algorithm for each first analysis stage, and output a recommendation level of each analysis algorithm.
In a possible embodiment, the processor is further configured to identify a first data format of the first data source; preprocessing the first data according to the first data format, wherein different data formats correspond to different preprocessing modes;
the processor 61 is specifically configured to perform user behavior analysis on the preprocessed first data by using a first analysis algorithm corresponding to each first analysis stage according to the execution sequence of the at least one first analysis stage.
In one possible embodiment, the analysis algorithm includes at least one of the following algorithms: a rule analysis algorithm, a statistical analysis algorithm, a time series analysis algorithm, a machine learning algorithm, an improved hidden Markov model A-HMM algorithm.
In one possible implementation, the A-HMM algorithm satisfies the following equation:
Figure BDA0002873323210000311
wherein AT is the optimized state transition matrix, T is the state transition matrix, alpha is the attention weight, alpha is not less than 0 and not more than 1, beta is the attention factor, i is a positive integer not less than 1 and not more than N, and N is the total number of elements in the state transition matrix.
In one possible embodiment, the processor 61 is further configured to:
A: selecting the first piece of second data from at least one piece of second data in a first data set as the currently analyzed second data;
B: analyzing the currently analyzed second data; if a single point anomaly is detected in the currently analyzed second data, increasing the current value of alpha by a first value; if a context anomaly is detected in the currently analyzed second data, reducing the current value of alpha by a second value; otherwise, keeping the current value of alpha unchanged;
C: judging whether the currently analyzed second data is the last piece of data; if not, executing step D; if yes, executing step E;
D: selecting, from the at least one piece of second data, the piece following the currently analyzed second data as the currently analyzed second data, and returning to step B;
E: determining the current value of alpha as the final value of alpha.
In one possible embodiment, the processor 61 is further configured to:
for each third data of at least one third data in a second data set, determining an abnormal behavior threat level of the third data, and if the abnormal behavior threat level of the third data exceeds a threat level threshold, determining that the third data is malicious sample data;
determining a second analysis algorithm corresponding to at least one second analysis stage in a second service scene and an analysis result of the second analysis algorithm on the malicious sample data, and determining a first score of the second analysis algorithm in the second analysis stage;
determining a second score of a combination of at least one second analysis algorithm in the second service scenario according to the first score of the second analysis algorithm of each second analysis stage;
for each second analysis algorithm in the at least one second analysis algorithm, determining a recommendation level of the second analysis algorithm in the corresponding second analysis stage in the second business scenario according to the first score and the second score of the second analysis algorithm.
In one possible embodiment, the processor 61 is further configured to:
and updating the second data set by adopting an analysis result obtained by analyzing the user behavior of the first data.
In one possible embodiment, the processor 61 is further configured to:
and if the abnormal behavior is detected to exist in the first data and a third service scene related to the first service scene exists, outputting alarm information of the abnormal behavior existing in the first service scene and the third service scene.
Because the principle by which the electronic device solves the problem is similar to that of the user behavior analysis method, the implementation of the electronic device can refer to the implementation of the method, and details are not repeated here.
The communication bus mentioned in the electronic device may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown, but this does not mean that there is only one bus or one type of bus.
The communication interface 62 is used for communication between the above-described electronic apparatus and other apparatuses.
The Memory may include a Random Access Memory (RAM) or a Non-Volatile Memory (NVM), such as at least one disk Memory. Alternatively, the memory may be at least one memory device located remotely from the processor.
The processor may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an application specific integrated circuit, a field programmable gate array or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or the like.
According to the embodiment of the invention, aiming at the first service scene to which the first data source belongs, the corresponding first analysis algorithm can be recommended for each first analysis stage in the first data source, and then the first analysis algorithm corresponding to each first analysis stage is adopted according to the execution sequence of each first analysis stage to perform user behavior analysis on the first data to be analyzed in the first data source.
Example 13:
on the basis of the foregoing embodiments, an embodiment of the present invention provides a computer-readable storage medium, in which a computer program executable by an electronic device is stored, and when the program is run on the electronic device, the electronic device is caused to execute the following steps:
determining a first data source, determining a first service scene to which the first data source belongs, and determining at least one first analysis stage corresponding to the first data source, wherein the first data source comprises first data to be analyzed;
determining a first analysis algorithm corresponding to each first analysis stage in the at least one first analysis stage according to a first service scenario to which the first data source belongs;
and according to the execution sequence of the at least one first analysis stage, adopting a first analysis algorithm corresponding to each first analysis stage to analyze the user behavior of the first data.
In a possible implementation, the determining, according to the first business scenario to which the first data source belongs, a first analysis algorithm corresponding to each first analysis stage in the at least one first analysis stage includes:
recommending at least one analysis algorithm for each first analysis stage according to the first service scene to which the first data source belongs; and determining the analysis algorithm selected by the user in the at least one analysis algorithm as the first analysis algorithm corresponding to the first analysis stage.
In a possible implementation manner, the recommending, for each first analysis stage according to the first business scenario to which the first data source belongs, at least one analysis algorithm for the first analysis stage includes:
and recommending at least one analysis algorithm for each first analysis stage according to the first service scene to which the first data source belongs, and outputting the recommended level of each analysis algorithm.
In a possible implementation, after determining the first data source, the method further includes:
identifying a first data format of the first data source;
preprocessing the first data according to the first data format, wherein different data formats correspond to different preprocessing modes;
according to the execution sequence of the at least one first analysis stage, performing user behavior analysis on the first data by adopting a first analysis algorithm corresponding to each first analysis stage, including:
and according to the execution sequence of the at least one first analysis stage, adopting a first analysis algorithm corresponding to each first analysis stage to analyze the user behavior of the preprocessed first data.
In one possible embodiment, the analysis algorithm includes at least one of the following algorithms: a rule analysis algorithm, a statistical analysis algorithm, a time series analysis algorithm, a machine learning algorithm, an improved hidden Markov model A-HMM algorithm.
In one possible implementation, the A-HMM algorithm satisfies the following equation:
Figure BDA0002873323210000341
wherein AT is the optimized state transition matrix, T is the state transition matrix, alpha is the attention weight, alpha is not less than 0 and not more than 1, beta is the attention factor, i is a positive integer not less than 1 and not more than N, and N is the total number of elements in the state transition matrix.
In one possible embodiment, the method further comprises:
A: selecting the first piece of second data from at least one piece of second data in a first data set as the currently analyzed second data;
B: analyzing the currently analyzed second data; if a single point anomaly is detected in the currently analyzed second data, increasing the current value of alpha by a first value; if a context anomaly is detected in the currently analyzed second data, reducing the current value of alpha by a second value; otherwise, keeping the current value of alpha unchanged;
C: judging whether the currently analyzed second data is the last piece of data; if not, executing step D; if yes, executing step E;
D: selecting, from the at least one piece of second data, the piece following the currently analyzed second data as the currently analyzed second data, and returning to step B;
E: determining the current value of alpha as the final value of alpha.
In one possible embodiment, the method further comprises:
for each third data of at least one third data in a second data set, determining an abnormal behavior threat level of the third data, and if the abnormal behavior threat level of the third data exceeds a threat level threshold, determining that the third data is malicious sample data;
determining a second analysis algorithm corresponding to at least one second analysis stage in a second service scene and an analysis result of the second analysis algorithm on the malicious sample data, and determining a first score of the second analysis algorithm in the second analysis stage;
determining a second score of a combination of at least one second analysis algorithm in the second service scenario according to the first score of the second analysis algorithm of each second analysis stage;
for each second analysis algorithm in the at least one second analysis algorithm, determining a recommendation level of the second analysis algorithm in the corresponding second analysis stage in the second business scenario according to the first score and the second score of the second analysis algorithm.
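The scoring flow above can be illustrated with a short Python sketch. The text does not specify how the first score, the second score, or the recommendation level are computed, so every formula here is an assumption made for illustration: the first score is taken as the fraction of malicious samples an algorithm flags, the second score as the mean of the first scores in the combination, and the recommendation level as a thresholded weighted sum of the two. All function names, the `weight` parameter, and the level thresholds are hypothetical.

```python
from statistics import mean
from typing import Callable, Dict, List

Record = dict
Detector = Callable[[Record], bool]  # returns True if the record is flagged


def select_malicious(records: List[Record],
                     threat_level: Callable[[Record], float],
                     threshold: float) -> List[Record]:
    """Keep records whose threat level exceeds the threshold."""
    return [r for r in records if threat_level(r) > threshold]


def first_score(detector: Detector, malicious: List[Record]) -> float:
    """Assumed first score: share of malicious samples the detector flags."""
    if not malicious:
        return 0.0
    return sum(1 for r in malicious if detector(r)) / len(malicious)


def recommend(stage_algorithms: Dict[str, Detector],
              malicious: List[Record],
              weight: float = 0.5) -> Dict[str, dict]:
    """Score one algorithm per stage and derive a recommendation level."""
    firsts = {stage: first_score(d, malicious)
              for stage, d in stage_algorithms.items()}
    # Assumed second score: mean first score of the whole combination.
    second = mean(list(firsts.values())) if firsts else 0.0
    result = {}
    for stage, s1 in firsts.items():
        combined = weight * s1 + (1 - weight) * second  # assumed blend
        level = "high" if combined >= 0.8 else "medium" if combined >= 0.5 else "low"
        result[stage] = {"first_score": s1, "second_score": second, "level": level}
    return result
```

A different deployment could replace the detection-rate score with precision, recall, or an analyst rating; the sketch only shows how per-stage scores and a combination score might feed one recommendation level.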
In one possible embodiment, the method further comprises:
updating the second data set with the analysis result obtained from the user behavior analysis of the first data.
In one possible embodiment, the method further comprises:
if abnormal behavior is detected in the first data and a third service scenario associated with the first service scenario exists, outputting alarm information indicating that abnormal behavior exists in the first service scenario and the third service scenario.
According to embodiments of the present invention, for the first service scenario to which the first data source belongs, a corresponding first analysis algorithm can be recommended for each first analysis stage of the first data source; the first analysis algorithm corresponding to each first analysis stage is then applied, in the execution order of the first analysis stages, to perform user behavior analysis on the first data to be analyzed in the first data source.
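A minimal Python sketch of this flow is given below. It assumes a lookup table that maps a service scenario to an ordered list of analysis stages, each paired with one analysis algorithm. The scenario name `vpn_access`, the stage names, the record fields, and both example algorithms are hypothetical and serve only to show stages being executed in order. Whether each stage consumes the raw data or the output of the previous stage is not stated in the text; the sketch passes the same data to every stage.

```python
from collections import Counter
from typing import Callable, Dict, List, Tuple

Record = Dict[str, str]
Algorithm = Callable[[List[Record]], object]


def failed_login_rule(records: List[Record]) -> List[str]:
    """Rule analysis (illustrative): users with at least one failed login."""
    return sorted({r["user"] for r in records if r["action"] == "login_failed"})


def action_counts(records: List[Record]) -> Dict[str, int]:
    """Statistical analysis (illustrative): number of events per user."""
    return dict(Counter(r["user"] for r in records))


# Hypothetical recommendation table: scenario -> ordered (stage, algorithm).
RECOMMENDED: Dict[str, List[Tuple[str, Algorithm]]] = {
    "vpn_access": [("screening", failed_login_rule),
                   ("profiling", action_counts)],
}


def analyze(scenario: str, records: List[Record]) -> List[Tuple[str, object]]:
    """Run each stage's algorithm in the stage execution order."""
    results = []
    for stage, algorithm in RECOMMENDED[scenario]:
        results.append((stage, algorithm(records)))
    return results
```

Keeping the table as data rather than code means a user-selected algorithm can replace a recommended one for a given stage without changing `analyze`.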
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present application without departing from the spirit and scope of the application. Thus, if such modifications and variations of the present application fall within the scope of the claims of the present application and their equivalents, the present application is intended to include such modifications and variations as well.

Claims (10)

1. A method for analyzing user behavior, the method comprising:
determining a first data source, determining a first service scenario to which the first data source belongs, and determining at least one first analysis stage corresponding to the first data source, wherein the first data source comprises first data to be analyzed;
determining a first analysis algorithm corresponding to each first analysis stage in the at least one first analysis stage according to the first service scenario to which the first data source belongs;
and performing user behavior analysis on the first data by applying, in the execution order of the at least one first analysis stage, the first analysis algorithm corresponding to each first analysis stage.
2. The method of claim 1, wherein determining the first analysis algorithm corresponding to each of the at least one first analysis stage according to the first service scenario to which the first data source belongs comprises:
recommending at least one analysis algorithm for each first analysis stage according to the first service scenario to which the first data source belongs; and determining the analysis algorithm selected by a user from the at least one analysis algorithm as the first analysis algorithm corresponding to the first analysis stage.
3. The method of claim 2, wherein recommending at least one analysis algorithm for each first analysis stage according to the first service scenario to which the first data source belongs comprises:
recommending at least one analysis algorithm for each first analysis stage according to the first service scenario to which the first data source belongs, and outputting a recommendation level of each analysis algorithm.
4. The method of any of claims 1-3, wherein after determining the first data source, further comprising:
identifying a first data format of the first data source;
preprocessing the first data according to the first data format, wherein different data formats correspond to different preprocessing modes;
according to the execution sequence of the at least one first analysis stage, performing user behavior analysis on the first data by adopting a first analysis algorithm corresponding to each first analysis stage, including:
and according to the execution sequence of the at least one first analysis stage, adopting a first analysis algorithm corresponding to each first analysis stage to analyze the user behavior of the preprocessed first data.
5. The method of any of claims 1-3, wherein the analysis algorithm comprises an improved hidden Markov model A-HMM algorithm;
the A-HMM algorithm satisfies the following equation:
Figure FDA0002873323200000021
wherein AT is the optimized state transition matrix, T is the state transition matrix, alpha is the attention weight, alpha is not less than 0 and not more than 1, beta is the attention factor, i is a positive integer not less than 1 and not more than N, and N is the total number of elements in the state transition matrix.
6. The method of claim 5, further comprising:
A: selecting, from at least one piece of second data in a first data set, the first piece of second data as the currently analyzed second data;
B: analyzing the currently analyzed second data; if a single-point anomaly is detected for the currently analyzed second data, increasing the current value of alpha by a first value; if a contextual anomaly is detected for the currently analyzed second data, decreasing the current value of alpha by a second value; otherwise, keeping the current value of alpha unchanged;
C: determining whether the currently analyzed second data is the last piece of data; if not, executing step D; if so, executing step E;
D: selecting, from the at least one piece of second data, the piece following the currently analyzed second data as the new currently analyzed second data, and returning to step B;
E: determining the current value of alpha as the final value of alpha.
7. The method of claim 3, further comprising:
for each piece of third data among at least one piece of third data in a second data set, determining an abnormal behavior threat level of the third data, and if the abnormal behavior threat level of the third data exceeds a threat level threshold, determining the third data to be malicious sample data;
determining a second analysis algorithm corresponding to at least one second analysis stage in a second service scenario and an analysis result of the second analysis algorithm on the malicious sample data, and determining a first score of the second analysis algorithm in the second analysis stage;
determining a second score of a combination of at least one second analysis algorithm in the second service scenario according to the first score of the second analysis algorithm of each second analysis stage;
for each second analysis algorithm in the at least one second analysis algorithm, determining a recommendation level of the second analysis algorithm in the corresponding second analysis stage in the second service scenario according to the first score and the second score of the second analysis algorithm.
8. A user behavior analysis apparatus, characterized in that the apparatus comprises:
a determining module, configured to determine a first data source, and to determine a first service scenario to which the first data source belongs and at least one first analysis stage corresponding to the first data source, wherein the first data source comprises first data to be analyzed; and to determine a first analysis algorithm corresponding to each first analysis stage in the at least one first analysis stage according to the first service scenario to which the first data source belongs;
and an analysis module, configured to perform user behavior analysis on the first data by applying, in the execution order of the at least one first analysis stage, the first analysis algorithm corresponding to each first analysis stage.
9. An electronic device, characterized in that the electronic device comprises at least a processor and a memory, the processor being adapted to carry out the steps of the user behavior analysis method according to any of claims 1-7 when executing a computer program stored in the memory.
10. A computer-readable storage medium, characterized in that it stores a computer program which, when being executed by a processor, carries out the steps of the user behavior analysis method according to any one of claims 1 to 7.
CN202011612631.0A 2020-12-30 2020-12-30 User behavior analysis method, device, equipment and medium Active CN112733015B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011612631.0A CN112733015B (en) 2020-12-30 2020-12-30 User behavior analysis method, device, equipment and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011612631.0A CN112733015B (en) 2020-12-30 2020-12-30 User behavior analysis method, device, equipment and medium

Publications (2)

Publication Number Publication Date
CN112733015A true CN112733015A (en) 2021-04-30
CN112733015B CN112733015B (en) 2024-06-14

Family

ID=75610283

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011612631.0A Active CN112733015B (en) 2020-12-30 2020-12-30 User behavior analysis method, device, equipment and medium

Country Status (1)

Country Link
CN (1) CN112733015B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114756865A (en) * 2022-04-24 2022-07-15 安天科技集团股份有限公司 RDP file security detection method and device, electronic equipment and storage medium

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103942575A (en) * 2014-04-02 2014-07-23 公安部第三研究所 System and method for analyzing intelligent behaviors based on scenes and Markov logic network
US20160019298A1 (en) * 2014-07-15 2016-01-21 Microsoft Corporation Prioritizing media based on social data and user behavior
US9798788B1 (en) * 2012-12-27 2017-10-24 EMC IP Holding Company LLC Holistic methodology for big data analytics
CN107844634A (en) * 2017-09-30 2018-03-27 平安科技(深圳)有限公司 Polynary universal model platform modeling method, electronic equipment and computer-readable recording medium
CN108334530A (en) * 2017-08-24 2018-07-27 平安普惠企业管理有限公司 User behavior information analysis method, equipment and storage medium
WO2019007306A1 (en) * 2017-07-06 2019-01-10 众安信息技术服务有限公司 Method, device and system for detecting abnormal behavior of user
CN110633569A (en) * 2019-09-27 2019-12-31 上海赛可出行科技服务有限公司 Hidden Markov model-based user behavior and entity behavior analysis method
US20200137083A1 (en) * 2018-10-24 2020-04-30 Nec Laboratories America, Inc. Unknown malicious program behavior detection using a graph neural network

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
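宋礼 (Song Li): "出租车轨迹异常检测可视分析系统的研究与实现" [Research and Implementation of a Visual Analysis System for Taxi Trajectory Anomaly Detection], 中国硕士学位论文全文数据库 信息科技辑 [China Master's Theses Full-text Database, Information Science and Technology] *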

Also Published As

Publication number Publication date
CN112733015B (en) 2024-06-14

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant