CN111737320B - Group user behavior baseline establishment method and device and computer equipment - Google Patents

Info

Publication number
CN111737320B
Authority
CN
China
Prior art keywords
user
group
behavior
establishing
baseline
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010621812.3A
Other languages
Chinese (zh)
Other versions
CN111737320A (en)
Inventor
罗振珊
唐炳武
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Property and Casualty Insurance Company of China Ltd
Original Assignee
Ping An Property and Casualty Insurance Company of China Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Property and Casualty Insurance Company of China Ltd filed Critical Ping An Property and Casualty Insurance Company of China Ltd
Priority to CN202010621812.3A priority Critical patent/CN111737320B/en
Publication of CN111737320A publication Critical patent/CN111737320A/en
Application granted granted Critical
Publication of CN111737320B publication Critical patent/CN111737320B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2465 Query processing support for facilitating data mining operations in structured databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Probability & Statistics with Applications (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Databases & Information Systems (AREA)
  • Fuzzy Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application discloses a method, a device and a computer device for establishing a group user behavior baseline. With the method provided by the embodiments of the application, a plurality of group user behavior baselines for different types of users can be established quickly.

Description

Group user behavior baseline establishment method and device and computer equipment
Technical Field
The present application relates to the field of data mining, and in particular, to a method, an apparatus, and a computer device for establishing a group user behavior baseline.
Background
With the rapid development of network application technology, users' network behaviors are increasingly diverse, and identifying those behaviors and discovering abnormal behavior events is increasingly important for ensuring network security. At present, whether a user has abnormal behavior events is judged mainly by establishing an individual behavior baseline for the user and comparing the user's behavior against it. For convenience of management, however, the same group user behavior baseline is generally applied to everyone in the same department or group. Because each person's working mode, habits and so on differ, mismanagement may occur when the group user behavior baseline does not match the individual. Solving this problem requires a method for establishing behavior baselines corresponding to different types of people, but there is not yet a good way to establish such baselines quickly.
Disclosure of Invention
The main purpose of the application is to provide a method, a device, computer equipment and a storage medium for establishing a group user behavior baseline, so as to solve the prior-art problem that behavior baselines for different types of groups cannot be established quickly.
In order to achieve the above object, the present application provides a method for establishing a group user behavior baseline, including:
Acquiring a user portrait of each user and an individual behavior baseline corresponding to the user portrait, wherein the user portrait is a portrait constructed based on the specified information of the user and log history data corresponding to the user in a specified time period;
clustering calculation is carried out on all the user portraits to obtain user groups of different categories;
Based on individual behavior baselines of different users in the user group of the same category, a corresponding group user behavior baseline is established.
Further, the method for acquiring the individual behavior baseline corresponding to the user portrait comprises the following steps:
acquiring log history data of the user and specified information of the user;
obtaining the date corresponding to each piece of data in the log history data;
classifying the data whose date is a working day to obtain working day log history data, and classifying the data whose date is a holiday to obtain holiday log history data;
establishing a working day individual behavior baseline of the user according to the working day log history data and the specified information of the user, and establishing a holiday individual behavior baseline of the user according to the holiday log history data and the specified information of the user.
Further, the step of establishing a corresponding group user behavior baseline based on individual behavior baselines of different users in the same class of user group further comprises:
removing abnormal data in the individual behavior baselines of different users in the user group of the same category by adopting an isolation forest algorithm;
and establishing the group user behavior baseline by using each individual behavior baseline after the abnormal data are removed.
Further, after the step of establishing the corresponding group user behavior base line based on the individual behavior base lines of the different users in the same class user group, the method further comprises:
acquiring a current behavior log of a current period of a first user and a user portrait of the first user;
extracting a specified characteristic value of the current behavior log, wherein the specified characteristic value is a characteristic value required to be reflected in the group user behavior baseline; and determining a user group category of the first user according to the user portrait of the first user;
comparing the specified characteristic value with a reference characteristic value corresponding to the specified characteristic in a first group user behavior baseline, wherein the first group user behavior baseline is the group user behavior baseline corresponding to the user group category to which the first user belongs;
and if the comparison result meets the condition for triggering risk early warning, sending out alarm information.
Further, after the step of sending out the alarm information if the comparison result meets the condition of triggering the risk early warning, the method further includes:
judging whether the specified characteristic value reaches a preset abnormal data threshold;
if not, marking the specified characteristic on the individual behavior baseline corresponding to the first user.
Further, after the step of labeling the specified feature on the individual behavior base line corresponding to the first user, the method further includes:
judging whether the marked times of the features on the individual behavior baselines corresponding to the first user reach a preset quantity value or not;
If yes, reconstructing an individual behavior baseline corresponding to the first user.
Further, in one embodiment, after the step of establishing the group user behavior baseline based on the individual behavior baselines of the different users in the same class of user group, the method further includes:
And associating the users with the categories by using association rules.
The application also provides a device for establishing a group user behavior baseline, comprising:
An acquisition unit, configured to acquire a user portrait of each user, and an individual behavior baseline corresponding to the user portrait, where the user portrait is a portrait constructed based on specified information of the user and log history data corresponding to the user in a specified period of time;
the clustering unit is used for carrying out clustering calculation on all the user portraits to obtain user groups of different categories;
The establishing unit is used for establishing corresponding group user behavior baselines based on individual behavior baselines of different users in the user group of the same category.
The application also provides a computer device comprising a memory storing a computer program and a processor implementing the steps of the method of any of claims 1 to 7 when the computer program is executed.
The application also provides a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the method of any of claims 1 to 7.
When the establishment method is realized, the user portraits are established firstly, then the users are classified through the user portraits, and finally, the group user behavior baselines of the same class are established based on the individual behavior baselines of the users. By using the method provided by the embodiment of the application, a plurality of group user behavior baselines aiming at different types of users can be quickly established.
Drawings
FIG. 1 is a flow chart of a method for establishing a group user behavior baseline according to an embodiment of the application;
FIG. 2 is a block diagram schematically illustrating a device for establishing a group user behavior baseline according to an embodiment of the present application;
Fig. 3 is a schematic block diagram of a computer device according to an embodiment of the present application.
The achievement of the objects, functional features and advantages of the present application will be further described with reference to the accompanying drawings, in conjunction with the embodiments.
Detailed Description
The present application will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present application more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the application.
Referring to fig. 1, an embodiment of the present application provides a method for establishing a group user behavior baseline, including:
S1, obtaining a user portrait of each user and an individual behavior baseline corresponding to the user portrait, wherein the user portrait is a portrait constructed based on the specified information of the user and log history data corresponding to the user in a specified time period;
S2, carrying out clustering calculation on all the user portraits to obtain user groups of different categories;
S3, establishing corresponding group user behavior baselines based on individual behavior baselines of different users in the user group of the same category.
In this embodiment, the server acquires the specified information and the log history data of each user, builds a user portrait from them, and labels the log history data. Through the labels, the server can quickly read the information in the user portrait and establish the individual behavior baseline of the user. The server then performs cluster analysis on the user portraits to obtain groups of different categories, and establishes a group user behavior baseline for each category.
As described in step S1, the server obtains the specified information of each user, which mainly includes the user's gender, age, department, post information, educational background, and so on. A user portrait is then built from the log history data of each user combined with this specified information. The labels of the user portrait include: ① user type (four types: salesperson, internal staff, car account, and other); ② workday activity = (number of days on which the specified system was accessed during the past 90 workdays) / (total number of workdays); ③ holiday activity = (number of days on which the specified system was accessed during the holidays of the past 90 days, including weekends and legal holidays) / (total number of holiday days); ④ diligence index = total overtime / total days; ⑤ whether abnormal behavior exists, obtained by matching the results of other anomaly detection models. The specified system may be, for example, PNBS (the Ping An property insurance new core business system). The user type, workday activity, holiday activity, diligence index, and so on are derived from the log history data.
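The activity labels above reduce to simple ratios. The following is a minimal sketch, assuming the access dates and overtime totals have already been extracted from the logs; the function names, the `portrait` dictionary layout, the example calendar, and the use of hours as the overtime unit are all illustrative assumptions and do not come from the application.

```python
from datetime import date


def activity(access_dates, calendar_days):
    """Fraction of the given calendar days on which the system was accessed."""
    if not calendar_days:
        return 0.0
    return len(set(access_dates) & set(calendar_days)) / len(calendar_days)


def diligence_index(total_overtime_hours, total_days):
    """Total overtime divided by total days (hours are an assumed unit)."""
    return total_overtime_hours / total_days if total_days else 0.0


# Hypothetical calendar and access history for one user.
workdays = [date(2020, 6, d) for d in (22, 23, 24, 29, 30)]
holidays = [date(2020, 6, d) for d in (25, 26, 27, 28)]
accessed = [date(2020, 6, 22), date(2020, 6, 23),
            date(2020, 6, 24), date(2020, 6, 27)]

portrait = {
    "user_type": "salesperson",
    "workday_activity": activity(accessed, workdays),   # 3 of 5 workdays
    "holiday_activity": activity(accessed, holidays),   # 1 of 4 holidays
    "diligence_index": diligence_index(6.0, 9),
    "has_abnormal_behavior": False,  # would come from other detection models
}
```

In practice the window would cover the past 90 days rather than the handful of dates used here.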
The individual behavior baseline of the user is obtained by extracting specified features from the log history data. For example, an individual behavior baseline is established from the log history data of the past 90 days, and the extracted specified features include: ① total access frequency per day; ② number of SESSION_IDs per day; ③ number of IPs per day; ④ number of price inquiries per day; ⑤ number of searches per day; ⑥ number of insurance tracking operations per day; ⑦ number of HTTP access failures per day. Each feature is summarized by its mean, standard deviation, Q1, Q3, maximum, and minimum. When the mean and standard deviation are calculated, noise data is removed by the interquartile range method to avoid its influence. Q1 and Q3 are quartiles. The first quartile (Q1), also called the "lower quartile", is the value at the 25% position after all values in the sample are sorted from smallest to largest. The second quartile (Q2), also called the "median", is the value at the 50% position. The third quartile (Q3), also called the "upper quartile", is the value at the 75% position. The difference between the third quartile and the first quartile is called the interquartile range (IQR).
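The per-feature statistics can be sketched as follows. The text does not say which cut-offs the interquartile range method uses, so this sketch assumes the common fences Q1 - 1.5 × IQR and Q3 + 1.5 × IQR; the function name, the choice of population standard deviation, and the sample data are likewise assumptions.

```python
import statistics


def daily_feature_baseline(daily_counts, k=1.5):
    """Summarize one per-day feature; mean/std are computed after IQR filtering."""
    q1, _, q3 = statistics.quantiles(daily_counts, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr  # assumed fences
    cleaned = [x for x in daily_counts if low <= x <= high]
    return {
        "mean": statistics.mean(cleaned),
        "std": statistics.pstdev(cleaned),
        "q1": q1,
        "q3": q3,
        "max": max(daily_counts),
        "min": min(daily_counts),
    }


# Hypothetical searches per day; 95 is a noisy day.
searches_per_day = [10, 12, 11, 13, 12, 11, 10, 12, 95]
baseline = daily_feature_baseline(searches_per_day)
```

With this data the noisy day is excluded from the mean and standard deviation but still appears as the maximum, which is one possible reading of the description.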
As described in step S2, once all user portraits have been obtained they are clustered. The optimal number of clusters is first determined by the elbow method, and the K-means algorithm is then used to cluster the user portraits. The procedure is: select K points as the initial cluster centers; assign each object to its nearest center to form K clusters; recompute the center of each cluster; and repeat the assignment and update steps until the clusters no longer change or the specified number of iterations is reached. The result is a plurality of user groups of different categories. The elbow method is a commonly used way of determining the optimal number of clusters for K-means and is not described in detail here.
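The clustering loop can be sketched in plain Python as below. This is an illustration under stated assumptions, not the application's implementation: portraits are reduced to hypothetical numeric pairs (workday activity, holiday activity), the initial centers are taken at evenly spaced positions in the input list so the run is deterministic, and the elbow is read off by inspecting how the within-cluster sum of squares falls as K grows.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iter=100):
    """Return (labels, centers, inertia) for a basic K-means run."""
    n = len(points)
    centers = [points[i * n // k] for i in range(k)]  # assumed deterministic init
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centers[j]))
                      for p in points]
        if new_labels == labels:  # clusters no longer change
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    inertia = sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))
    return labels, centers, inertia


# Hypothetical portraits: (workday activity, holiday activity).
portraits = [(0.90, 0.10), (0.95, 0.05), (0.85, 0.15),
             (0.20, 0.80), (0.25, 0.75), (0.15, 0.85)]

# Elbow method: inspect inertia for several K and pick the bend in the curve.
inertia_by_k = {k: kmeans(portraits, k)[2] for k in range(1, 5)}
labels, centers, _ = kmeans(portraits, 2)
```

For this toy data the inertia drops sharply from K = 1 to K = 2 and only slightly afterwards, so K = 2 would be the elbow.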
As described in step S3, the group user behavior baseline of a user group is established from the individual behavior baselines of the users in that group. This yields a behavior baseline suited to the group, so that judgments made against it in subsequent use are more balanced and the baseline is easier to adopt widely. In the application, the group user behavior baseline has the same features as the individual behavior baselines; only the specific values differ. In a particular embodiment, each feature value in the group user behavior baseline may be, for example, the average of the corresponding feature values in the individual behavior baselines of the group.
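The averaging embodiment mentioned above can be sketched as follows, assuming each individual baseline is a dictionary with the same feature keys; the key names and values are hypothetical.

```python
def group_baseline(individual_baselines):
    """Average each feature value over the individual baselines of one group."""
    n = len(individual_baselines)
    features = individual_baselines[0].keys()
    return {f: sum(b[f] for b in individual_baselines) / n for f in features}


# Hypothetical individual baselines of three users in the same category.
members = [
    {"access_mean": 40.0, "search_mean": 10.0},
    {"access_mean": 50.0, "search_mean": 14.0},
    {"access_mean": 60.0, "search_mean": 12.0},
]
group = group_baseline(members)
```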
In one embodiment, the method for obtaining the individual behavior baseline corresponding to the user portrait includes:
acquiring log history data of the user;
obtaining the date corresponding to each piece of data in the log history data;
classifying the data whose date is a working day to obtain working day log history data, and classifying the data whose date is a holiday to obtain holiday log history data;
establishing a working day individual behavior baseline of the user according to the working day log history data and the specified information of the user, and establishing a holiday individual behavior baseline of the user according to the holiday log history data and the specified information of the user.
In this embodiment, because the behavior baselines for working days and holidays differ, the two need to be analyzed separately. Whether a given date is a working day or a holiday can be determined by calling the Baidu interface http://www.easybots.cn/api/holiday.php from Java. Further, when group user behavior baselines are established, a working day group user behavior baseline, a holiday group user behavior baseline, and so on can be established as needed. For example, working day individual behavior baselines are selected when establishing a working day group user behavior baseline, and holiday individual behavior baselines are selected when establishing a holiday group user behavior baseline.
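The split into working day and holiday logs can be sketched offline as below. Instead of calling the remote interface, this sketch assumes weekends are holidays and takes a locally supplied set of legal holidays; the set contents, record layout, and function names are hypothetical.

```python
from datetime import date

# Hypothetical legal holidays, supplied locally for illustration only.
EXAMPLE_LEGAL_HOLIDAYS = {date(2020, 6, 25), date(2020, 6, 26)}


def is_holiday(day, legal_holidays=EXAMPLE_LEGAL_HOLIDAYS):
    """Treat Saturdays, Sundays, and listed legal holidays as holidays."""
    return day.weekday() >= 5 or day in legal_holidays


def split_logs(records):
    """Partition log records into (working day records, holiday records)."""
    workday, holiday = [], []
    for rec in records:
        (holiday if is_holiday(rec["date"]) else workday).append(rec)
    return workday, holiday


logs = [
    {"date": date(2020, 6, 22), "action": "search"},   # Monday
    {"date": date(2020, 6, 25), "action": "login"},    # listed legal holiday
    {"date": date(2020, 6, 27), "action": "search"},   # Saturday
]
workday_logs, holiday_logs = split_logs(logs)
```

A real calendar also has make-up working days on some weekends, which this simplified rule does not handle.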
In one embodiment, the step S3 of establishing a corresponding group user behavior baseline based on the individual behavior baselines of different users in the same group of users further includes:
S301, removing abnormal data in the individual behavior baselines of different users in the user group of the same category by adopting an isolation forest algorithm;
S302, establishing the group user behavior baseline by using each individual behavior baseline after the abnormal data are removed.
In this embodiment, the isolation forest algorithm (iForest) is commonly used to mine abnormal data, for example in attack detection and traffic anomaly analysis for network security, and in fraud detection at financial institutions. The algorithm has low memory requirements, fast processing, and linear time complexity. It handles high-dimensional data and big data well and can also be used for online anomaly detection. Abnormal data here means interference data: for example, the number of operations by a certain user on a certain day may be especially large or especially small, and such obviously abnormal data would distort the result of data analysis, so the isolation forest algorithm can be used to remove it when the mean and standard deviation are calculated. For example, a user normally logs in to web page A 1 to 2 times a day, but on a certain day, for some reason, has to log in repeatedly, 50 times in total; the value 50 is abnormal data. Establishing the group user behavior baseline from the individual behavior baselines after the abnormal data are removed makes the resulting group user behavior baseline more accurate and more practical.
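The login example can be reproduced with a small one-dimensional isolation forest (iForest) written in plain Python. This is a teaching sketch, not the application's implementation: it uses every point in every tree instead of subsampling, and the 0.6 score cut-off, tree count, and function names are assumptions chosen for illustration.

```python
import math
import random


def c_factor(n):
    """Average path length of an unsuccessful binary-search-tree lookup."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + 0.5772156649) - 2.0 * (n - 1) / n


def build_tree(values, depth, limit, rng):
    """Recursively split on a random value until points are isolated."""
    lo, hi = min(values), max(values)
    if depth >= limit or lo == hi:
        return ("leaf", len(values))
    split = rng.uniform(lo, hi)
    left = [v for v in values if v < split]
    right = [v for v in values if v >= split]
    if not left or not right:  # degenerate split on the boundary
        return ("leaf", len(values))
    return ("node", split,
            build_tree(left, depth + 1, limit, rng),
            build_tree(right, depth + 1, limit, rng))


def path_length(tree, x, depth=0):
    """Depth at which x lands, adjusted for unsplit points in the leaf."""
    if tree[0] == "leaf":
        return depth + c_factor(tree[1])
    _, split, left, right = tree
    return path_length(left if x < split else right, x, depth + 1)


def anomaly_scores(values, n_trees=100, seed=0):
    """Score near 1 means easy to isolate (anomalous); below 0.5 means normal."""
    if len(values) < 2:
        return [0.0] * len(values)
    rng = random.Random(seed)
    limit = math.ceil(math.log2(len(values)))
    trees = [build_tree(values, 0, limit, rng) for _ in range(n_trees)]
    norm = c_factor(len(values))
    return [2 ** (-(sum(path_length(t, v) for t in trees) / n_trees) / norm)
            for v in values]


# Daily logins to web page A: normally 1 or 2, with one day of 50.
daily_logins = [1, 2, 1, 2, 1, 1, 2, 2, 1, 50]
scores = anomaly_scores(daily_logins)
cleaned = [v for v, s in zip(daily_logins, scores) if s < 0.6]  # assumed cut-off
```

The value 50 is separated from the rest after one or two random splits, so its average path is short and its score is high, while the repeated values 1 and 2 stay grouped and score lower.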
In one embodiment, after the step S3 of establishing the corresponding group user behavior baseline based on the individual behavior baselines of the different users in the same user group, the method further includes:
acquiring a current behavior log of a current period of a first user and a user portrait of the first user;
extracting a specified characteristic value of the current behavior log, wherein the specified characteristic value is a characteristic value required to be reflected in the group user behavior baseline; and determining a user group category of the first user according to the user portrait of the first user;
comparing the specified characteristic value with a reference characteristic value corresponding to the specified characteristic in a first group user behavior baseline, wherein the first group user behavior baseline is the group user behavior baseline corresponding to the user group category to which the first user belongs;
and if the comparison result meets the condition for triggering risk early warning, sending out alarm information.
In this embodiment, the individual behavior baseline and the group user behavior baseline are defined over the same set period, such as a daily, weekly, or quarterly behavior baseline. The current period is the period now in progress, which generally has not yet ended. As an example of the comparison, suppose the specified feature is the number of logins to website A within one period and the corresponding reference feature value is 5. Risk early warning is not started while the specified feature value does not exceed 7; that is, the condition for triggering the risk early warning is a specified feature value of 8 or more. In another embodiment, Q1 + 1.5 × (Q3 - Q1) is used as the trigger threshold: when the specified feature value > Q1 + 1.5 × (Q3 - Q1), the feature is considered to deviate from the individual behavior baseline, the risk alarm is triggered automatically, and an instruction is sent to the server, which then identifies and judges the feature. Q1 and Q3 are the first and third quartiles described above.
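The quartile-based trigger in the second embodiment can be written directly from the stated formula. The function names and the example quartile values below are hypothetical.

```python
def trigger_threshold(q1, q3, factor=1.5):
    """Threshold as stated in the text: Q1 + 1.5 * (Q3 - Q1)."""
    return q1 + factor * (q3 - q1)


def needs_alert(value, q1, q3):
    """True when the observed feature value exceeds the trigger threshold."""
    return value > trigger_threshold(q1, q3)


# Hypothetical quartiles of "logins to website A per period" in a group baseline.
q1, q3 = 2.0, 6.0
threshold = trigger_threshold(q1, q3)   # 2.0 + 1.5 * 4.0
alerts = [needs_alert(v, q1, q3) for v in (5, 8, 9)]
```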
In one embodiment, after the step of sending out alarm information if the comparison result meets the condition for triggering risk early warning, the method further includes:
judging whether the specified characteristic value reaches a preset abnormal data threshold;
if not, marking the specified characteristic on the individual behavior baseline corresponding to the first user.
In this embodiment, after the risk alarm is triggered, the server judges whether the specified feature value reaches the preset abnormal data threshold. If so, the value is treated as abnormal data and removed from the individual behavior baseline. If not, the feature is marked, because the behavior habits of the first user may have changed. For example, the first user previously logged in to web page A 1 to 4 times a day but logs in 8 times today; if the warning is triggered but the abnormal data threshold is not reached, the feature is marked, which facilitates follow-up tracking of the data.
In one embodiment, after the step of labeling the specified feature on the individual behavior base line corresponding to the first user, the method further includes:
judging whether the marked times of the features on the individual behavior baselines corresponding to the first user reach a preset quantity value or not;
If yes, reconstructing an individual behavior baseline corresponding to the first user.
In this embodiment, when the number of marks reaches a preset threshold, the personal behavior of the first user is considered to have changed. The number of marks is the sum of the marks on all marked features. For example, the first user previously logged in to web page A 1 to 4 times a day but logs in 7 times today; if the warning is triggered but the abnormal data threshold is not reached, the feature is marked once. If the first user then logs in to web page A 7 to 10 times a day for the next N days, the feature "logins to web page A" has been marked N+1 times. Other features may be marked M times during the same period, so the total number of marks is N+1+M (M and N are positive integers). When the total reaches the preset threshold, the personal behavior of the user is determined to have changed, and the individual behavior baseline of the first user needs to be re-established.
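The marking and rebuild logic of the last two embodiments can be sketched as a small state holder. The class name, the single shared abnormal data threshold, and the returned action strings are assumptions made for illustration.

```python
class MarkTracker:
    """Tracks marks on one user's individual baseline after alerts."""

    def __init__(self, abnormal_threshold, rebuild_after):
        self.abnormal_threshold = abnormal_threshold  # assumed shared by all features
        self.rebuild_after = rebuild_after            # preset number of marks
        self.marks = {}

    def handle_alert(self, feature, value):
        """Return 'discard', 'mark', or 'rebuild' for an alerted feature value."""
        if value >= self.abnormal_threshold:
            return "discard"  # abnormal data, excluded from the baseline
        self.marks[feature] = self.marks.get(feature, 0) + 1
        if sum(self.marks.values()) >= self.rebuild_after:
            self.marks.clear()
            return "rebuild"  # behavior has changed; re-establish the baseline
        return "mark"


tracker = MarkTracker(abnormal_threshold=30, rebuild_after=3)
actions = [
    tracker.handle_alert("logins_A", 7),
    tracker.handle_alert("logins_A", 50),
    tracker.handle_alert("logins_A", 9),
    tracker.handle_alert("searches", 12),
]
```

Marks are summed across features, matching the N+1+M count in the example, and the counter is reset once a rebuild is requested (the reset is an assumption).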
In one embodiment, after the step S3 of establishing the corresponding group user behavior baseline based on the individual behavior baselines of the different users in the same user group, the method further includes:
And associating the users with the categories by using association rules.
In this embodiment, association rules are an important topic in data mining and are used to mine valuable correlations between data items from large amounts of data. Typical questions answered by association rules are: "If a consumer purchased product A, how likely is he to purchase product B?" and "If he purchased products C and D, what other products will he purchase?" The same data features can be observed from different dimensions, such as date, region, channel, product, and user, and a 3D model is built from these dimensions. More than 80 categories are obtained through the clustering in step S2, and the main features of each category can be roughly known. Take examination results as an example. Category A: the main feature is a "fail" result in subject a. Category B: the main feature is an "excellent" result in subject b. Category C: subject a as taught by teacher C. Category D: subject b as taught by teacher D. Previously, manual classification was needed to obtain the relationship between category A and category C, and manual work could introduce deviations. With association rules, it can be found that most students of subject a taught by teacher C are "failing", so the analysis reveals that teacher C has an obvious problem in teaching subject a that needs to be corrected, which provides a solid technical basis for urging teacher C to improve the teaching effect. Users are likewise associated with one another, so the reasons for high or low performance, efficiency, and so on can be analyzed from these associations to provide a reference for solving problems. For example, two colleagues in the same business may be upstream and downstream of each other in terms of business logic; the association rules link them together, and if the downstream work is inefficient, the upstream progress may be affected.
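The teacher example can be expressed with the two standard association-rule measures, support and confidence. The record layout and item names below are hypothetical, and the application does not state which mining algorithm or thresholds it uses.

```python
def count(transactions, items):
    """Number of transactions that contain every item in `items`."""
    items = set(items)
    return sum(1 for t in transactions if items <= t)


def support(transactions, items):
    """Share of all transactions that contain `items`."""
    return count(transactions, items) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Share of transactions with `antecedent` that also contain `consequent`."""
    base = count(transactions, antecedent)
    if base == 0:
        return 0.0
    return count(transactions, set(antecedent) | set(consequent)) / base


# Hypothetical exam records: teacher, subject, result.
records = [
    {"teacher_C", "subject_a", "fail"},
    {"teacher_C", "subject_a", "fail"},
    {"teacher_C", "subject_a", "fail"},
    {"teacher_C", "subject_a", "pass"},
    {"teacher_D", "subject_b", "excellent"},
    {"teacher_D", "subject_b", "pass"},
]
rule_support = support(records, {"teacher_C", "subject_a", "fail"})
rule_confidence = confidence(records, {"teacher_C", "subject_a"}, {"fail"})
```

A high confidence for the rule {teacher_C, subject_a} → {fail} is the kind of signal the paragraph above describes as pointing to a teaching problem.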
Further, in the application, association relationships among the user groups can be established through the association rules and the associations between the groups analyzed, so that, based on the group user behavior baseline of each group, relationships such as how different user groups should cooperate can be mined. The specific analysis method varies with the industry and the purpose of the analysis and is not described in detail here. In the application, the user categories are visualized and the association relationships are connected by colored arrows and the like, so that users can conveniently view, analyze, and use them.
According to the method for establishing the group user behavior base line, the user portrait is established firstly, then the users are classified through the user portrait, and then the group user behavior base line of the same class is established based on the individual behavior base line of the users. By using the method provided by the embodiment of the application, a plurality of group user behavior baselines aiming at different types of users can be quickly established.
Referring to fig. 2, the embodiment of the present application further provides a device for establishing a group user behavior baseline, including:
An acquisition unit 10, configured to acquire a user portrait of each user and an individual behavior baseline corresponding to the user portrait, where the user portrait is a portrait constructed based on the specified information of the user and log history data corresponding to the user in a specified period of time;
A clustering unit 20, configured to perform a clustering calculation on all the user portraits to obtain user groups of different categories;
The establishing unit 30 is configured to establish a corresponding group user behavior baseline based on individual behavior baselines of different users in the same class of user group.
In one embodiment, the device for establishing a group user behavior baseline further includes:
a log obtaining unit, configured to obtain log history data of the user;
a date acquisition unit, configured to obtain the date corresponding to each piece of data in the log history data;
The classification unit is used for classifying the data with the date being the working day to obtain the working day log historical data, and classifying the data with the date being the holiday to obtain the holiday log historical data;
The individual behavior base line establishing unit is used for establishing a working day individual behavior base line of the user according to the working day log historical data and the user specified information, and establishing a holiday individual behavior base line of the user according to the holiday log historical data and the user specified information.
In one embodiment, the establishing unit 30 further includes:
The abnormality removal module is used for removing abnormal data in the individual behavior baselines of different users in the user group of the same category by adopting an isolation forest algorithm;
the establishing module is used for establishing the group user behavior base line by utilizing each individual behavior base line after the abnormal data are removed.
In one embodiment, the device for establishing a group user behavior baseline further includes:
The first acquisition unit is used for acquiring a current behavior log of a current period of a first user and a user portrait of the first user;
The extraction unit is used for extracting the specified characteristic value of the current behavior log, wherein the specified characteristic value is the characteristic value required to be reflected in the group user behavior base line, and for determining a user group category of the first user according to the user portrait of the first user;
the comparison unit is used for comparing the specified characteristic value with a reference characteristic value corresponding to the specified characteristic in a first group user behavior base line, wherein the first group user behavior base line is a group user behavior base line corresponding to a user group category to which the first user belongs;
and the alarm unit is used for sending alarm information if the comparison result meets the condition of triggering risk early warning.
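The comparison and alarm units might be sketched as below. The text does not state what condition triggers a risk early warning, so the rule used here (relative deviation from the reference value above a `tolerance`) is purely an assumption, as are the dictionary layout of the baselines and the names `check_against_group_baseline` and `detect_risk`.

```python
def check_against_group_baseline(current_values, group_baseline, tolerance=0.5):
    """Return alert messages for features that deviate too far from the baseline.

    The trigger rule (relative deviation above `tolerance`) is an assumption.
    """
    alerts = []
    for feature, value in current_values.items():
        reference = group_baseline.get(feature)
        if reference is None:
            continue  # Feature is not reflected in the group baseline.
        deviation = abs(value - reference) / max(abs(reference), 1e-9)
        if deviation > tolerance:
            alerts.append(f"{feature}: observed {value}, baseline {reference}")
    return alerts


def detect_risk(user_category, current_values, group_baselines):
    """Pick the baseline of the user's group category and compare against it."""
    return check_against_group_baseline(current_values, group_baselines[user_category])
```

For example, with a hypothetical "sales" group whose baseline is 10 logins and 100 MB of downloads, a user of that group with 12 logins and 400 MB of downloads would raise an alert for the download feature only.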
In one embodiment, the device for establishing a group user behavior baseline further includes:
the first judging unit is used for judging whether the specified characteristic value reaches a preset abnormal data threshold value or not;
And the labeling unit is used for labeling the specified characteristic on the individual behavior base line corresponding to the first user if the specified characteristic value does not reach the preset abnormal data threshold value.
In one embodiment, the device for establishing a group user behavior baseline further includes:
the second judging unit is used for judging whether the number of times of marking the characteristics on the individual behavior base line corresponding to the first user reaches a preset quantity value or not;
and the reconstruction unit is used for reconstructing the individual behavior base line corresponding to the first user if the number of times reaches the preset quantity value.
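The labeling and reconstruction logic of the two preceding embodiments can be sketched as a small state holder. Several details are assumptions rather than statements from the text: marks are counted per feature, "reaching" the abnormal data threshold is read as greater than or equal to it, and the rebuilt baseline value is simply the mean of the marked observations. The class name `IndividualBaseline` and its fields are hypothetical.

```python
from collections import Counter


class IndividualBaseline:
    """Tracks marks on a user's baseline and rebuilds it after enough marks."""

    def __init__(self, values, abnormal_threshold, rebuild_after):
        self.values = dict(values)
        self.abnormal_threshold = abnormal_threshold  # preset abnormal data threshold per feature
        self.rebuild_after = rebuild_after            # preset number of marks
        self.marks = Counter()
        self.marked_values = {}

    def observe(self, feature, value):
        """Handle a value that already triggered an alert; True if rebuilt."""
        if value >= self.abnormal_threshold[feature]:
            return False  # Treated as genuinely abnormal: no mark is added.
        self.marks[feature] += 1
        self.marked_values.setdefault(feature, []).append(value)
        if self.marks[feature] < self.rebuild_after:
            return False
        # Rebuild rule (assumed): replace the value with the mean of the marked ones.
        observed = self.marked_values.pop(feature)
        self.values[feature] = sum(observed) / len(observed)
        self.marks[feature] = 0
        return True
```

The intent is that a deviation which stays below the abnormal data threshold but recurs often is eventually treated as the user's new normal behavior, while a value at or above the threshold never updates the baseline.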
In one embodiment, the device for establishing a group user behavior baseline further includes:
And the association unit is used for associating the users with the categories by using association rules.
The units, modules, and the like in the above embodiments are devices that correspondingly perform the methods in the above embodiments.
The device for establishing the group user behavior base line firstly establishes the user portrait, classifies the users through the user portrait, and establishes the group user behavior base line of the same class based on the individual behavior base line of the users. By using the method provided by the embodiment of the application, a plurality of group user behavior baselines aiming at different types of users can be quickly established.
Referring to fig. 3, in an embodiment of the present application, there is further provided a computer device, which may be a server, and an internal structure thereof may be as shown in fig. 3. The computer device includes a processor, a memory, a network interface, and a database connected by a system bus. The processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, computer programs, and a database. The internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage medium. The database of the computer device is used for storing log data, user portraits, behavior baselines and the like. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program, when executed by the processor, may implement the method for establishing a group user behavior baseline of any one of the embodiments described above.
It will be appreciated by those skilled in the art that the architecture shown in fig. 3 is merely a block diagram of a portion of the architecture in connection with the present inventive arrangements and is not intended to limit the computer devices to which the present inventive arrangements are applicable.
The embodiment of the application also provides a computer readable storage medium, on which a computer program is stored, which when executed by a processor, implements the method for establishing the group user behavior baseline of any one of the above embodiments.
Those skilled in the art will appreciate that implementing all or part of the above described methods may be accomplished by a computer program instructing the relevant hardware, the computer program being stored on a non-transitory computer readable storage medium and, when executed, comprising the steps of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium provided by the present application and used in embodiments may include non-volatile and/or volatile memory. The non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM), among others.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, apparatus, article, or method that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, apparatus, article, or method. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other like elements in a process, apparatus, article, or method that comprises the element.
The foregoing description is only of the preferred embodiments of the present application and is not intended to limit the scope of the application, and all equivalent structures or equivalent processes using the descriptions and drawings of the present application or directly or indirectly applied to other related technical fields are included in the scope of the application.

Claims (7)

1. The method for establishing the group user behavior base line is characterized by comprising the following steps:
Acquiring a user portrait of each user and an individual behavior baseline corresponding to the user portrait, wherein the user portrait is a portrait constructed based on the specified information of the user and log history data corresponding to the user in a specified time period;
clustering calculation is carried out on all the user portraits to obtain user groups of different categories;
establishing corresponding group user behavior baselines based on individual behavior baselines of different users in the user group of the same category;
The method for acquiring the individual behavior baselines corresponding to the user portraits comprises the following steps:
acquiring log history data of the user and specified information of the user;
obtaining dates corresponding to all pieces of data in the log historical data;
Classifying the data with the date being the working day to obtain working day log historical data, and classifying the data with the date being the holiday to obtain holiday log historical data;
Establishing a working day individual behavior baseline of the user according to the working day log historical data and the user specified information, and establishing a holiday individual behavior baseline of the user according to the holiday log historical data and the user specified information;
the step of establishing a corresponding group user behavior baseline based on individual behavior baselines of different users in the same class user group further comprises the following steps:
Removing abnormal data in individual baselines of different users in the user group of the same category by adopting an isolation forest algorithm;
establishing a group user behavior baseline by utilizing each individual behavior baseline after abnormal data are removed;
After the step of establishing the corresponding group user behavior base line based on the individual behavior base lines of different users in the same class user group, the method further comprises the following steps:
acquiring a current behavior log of a current period of a first user and a user portrait of the first user;
extracting a specified characteristic value of the current behavior log, wherein the specified characteristic value is a characteristic value required to be reflected in the group user behavior base line; and determining a user group category of the first user according to the user portrait of the first user;
Comparing the specified characteristic value with a reference characteristic value corresponding to the specified characteristic in a first group user behavior baseline, wherein the first group user behavior baseline is a group user behavior baseline corresponding to a user group category to which the first user belongs;
And if the comparison result meets the condition of triggering risk early warning, sending out alarm information.
2. The method for establishing a group user behavior baseline according to claim 1, wherein after the step of sending out the alarm information if the comparison result meets the condition for triggering the risk early warning, further comprises:
judging whether the specified characteristic value reaches a preset abnormal data threshold value or not;
if not, marking the specified characteristic on the individual behavior base line corresponding to the first user.
3. The method for establishing a group user behavior baseline according to claim 2, further comprising, after the step of labeling the specified feature on the individual behavior baseline corresponding to the first user:
judging whether the marked times of the features on the individual behavior baselines corresponding to the first user reach a preset quantity value or not;
If yes, reconstructing an individual behavior baseline corresponding to the first user.
4. The method of claim 1, wherein after the step of establishing the corresponding group user behavior baseline based on individual behavior baselines of different users in the same category of user group, further comprising:
And associating the users with the categories by using association rules.
5. A group user behavior baseline establishing apparatus for implementing a group user behavior baseline establishing method as defined in any one of claims 1 to 4, comprising:
An acquisition unit, configured to acquire a user portrait of each user, and an individual behavior baseline corresponding to the user portrait, where the user portrait is a portrait constructed based on specified information of the user and log history data corresponding to the user in a specified period of time;
the clustering unit is used for carrying out clustering calculation on all the user portraits to obtain user groups of different categories;
The establishing unit is used for establishing corresponding group user behavior baselines based on individual behavior baselines of different users in the user group of the same category.
6. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of the method of any of claims 1 to 4 when the computer program is executed.
7. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the method of any of claims 1 to 4.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010621812.3A CN111737320B (en) 2020-06-30 2020-06-30 Group user behavior baseline establishment method and device and computer equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010621812.3A CN111737320B (en) 2020-06-30 2020-06-30 Group user behavior baseline establishment method and device and computer equipment

Publications (2)

Publication Number Publication Date
CN111737320A CN111737320A (en) 2020-10-02
CN111737320B true CN111737320B (en) 2024-08-02

Family

ID=72652224

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010621812.3A Active CN111737320B (en) 2020-06-30 2020-06-30 Group user behavior baseline establishment method and device and computer equipment

Country Status (1)

Country Link
CN (1) CN111737320B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112579581B (en) * 2020-11-30 2023-04-14 贵州力创科技发展有限公司 Data access method and system of data analysis engine
CN114283917B (en) * 2021-11-25 2025-05-30 皖南医学院 A warning analysis method and system based on big data of chronic disease medication
CN114398966A (en) * 2021-12-31 2022-04-26 北京久安世纪科技有限公司 Early warning method for user portrait based on fortress machine
CN114925265B (en) * 2022-03-25 2025-01-28 上海聚均科技有限公司 Method, system, device and computer-readable storage medium for acquiring user portrait groups based on group behavior
CN114817377B (en) * 2022-06-29 2022-09-20 深圳红途科技有限公司 User portrait based data risk detection method, device, equipment and medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108021929A (en) * 2017-11-16 2018-05-11 华南理工大学 Mobile terminal electric business user based on big data, which draws a portrait, to establish and analysis method and system
CN108133390A (en) * 2017-12-22 2018-06-08 北京三快在线科技有限公司 For predicting the method and apparatus of user behavior and computing device

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2014172380A1 (en) * 2013-04-15 2014-10-23 Flextronics Ap, Llc Altered map routes based on user profile information
CN109086787B (en) * 2018-06-06 2023-07-25 平安科技(深圳)有限公司 User portrait acquisition method, device, computer equipment and storage medium
CN109740620B (en) * 2018-11-12 2023-09-26 平安科技(深圳)有限公司 Method, device, equipment and storage medium for establishing crowd figure classification model

Also Published As

Publication number Publication date
CN111737320A (en) 2020-10-02

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant