
CN106503863A - Method, system and terminal for predicting age characteristics based on a decision tree model - Google Patents

Method, system and terminal for predicting age characteristics based on a decision tree model

Info

Publication number
CN106503863A
Authority
CN
China
Prior art keywords
model parameter
training
decision
age
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610989789.7A
Other languages
Chinese (zh)
Inventor
曹杰
冯雨晖
宿晓坤
杨睿
李学超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
BEIJING HONGMA MEDIA CULTURE DEVELOPMENT CO LTD
Original Assignee
BEIJING HONGMA MEDIA CULTURE DEVELOPMENT CO LTD
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by BEIJING HONGMA MEDIA CULTURE DEVELOPMENT CO LTD
Priority to CN201610989789.7A
Publication of CN106503863A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Strategic Management (AREA)
  • Theoretical Computer Science (AREA)
  • Development Economics (AREA)
  • Physics & Mathematics (AREA)
  • Accounting & Taxation (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Finance (AREA)
  • General Physics & Mathematics (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Game Theory and Decision Science (AREA)
  • Data Mining & Analysis (AREA)
  • Human Resources & Organizations (AREA)
  • Databases & Information Systems (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • General Engineering & Computer Science (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The present invention provides a method, system and terminal for predicting age characteristics based on a decision tree model. The method includes: collecting basic data information; extracting the feature input variables and the target variable from the attributes of the basic data information to obtain sample data; dividing the sample data into a training set and a test set, inputting the training set into a decision tree model for model parameter training, applying the model parameter training result to the test set, and testing for a model parameter training result that satisfies a user-defined stability condition; outputting the model parameter training result that satisfies the user-defined stability condition; and using the rules of the output model parameter training result to periodically update the age prediction results for users of unknown age. The method, system and terminal provided by the present invention build a prediction model, predict users' ages, build accurate user profiles, lay a solid data foundation for scenarios such as marketing, and improve the accuracy of age identification.

Description

Method, system and terminal for predicting age characteristics based on a decision tree model
Technical field
The present invention relates to the field of e-commerce, and in particular to a method, system and terminal for predicting age characteristics based on a decision tree model.
Background
In user-centered industries such as e-commerce and social networking, it is often necessary to know users' real ages so that users can be classified and their behavioral characteristics and preferences studied. The network, however, is a virtual world, and users browsing it tend to be wary and therefore conceal part of their true identity.
Different age groups nevertheless differ in their behavioral characteristics, and a user's real age can be reflected in his or her own behavior. Targeted processing of behavioral data, together with feature extraction, can greatly improve the accuracy of age prediction.
In the prior art, age prediction is performed with a regression model. In the course of research, the inventors found that when a regression model is used for age prediction, because age is itself a continuous variable, the predicted result is not very accurate, for the following reasons:
1. Users of the same age often differ considerably in their external behavioral characteristics because of the living environments they are in;
2. Users of different ages, especially users of similar ages, often show no clear distinction in their behavioral characteristics;
As a result, the regression model predicts real age with a large error.
Summary of the invention
The main purpose of the present invention is to provide a method, system and terminal for predicting age characteristics based on a decision tree model, so as to overcome the technical problem that age characteristics are difficult to predict in the existing e-commerce field.
One aspect of the present invention provides a method for predicting age characteristics based on a decision tree model, including:
collecting basic data information;
extracting the feature input variables and the target variable from the attributes of the basic data information to obtain sample data;
dividing the sample data into a training set and a test set, inputting the training set into a decision tree model for model parameter training, applying the model parameter training result to the test set, and testing for a model parameter training result that satisfies a user-defined stability condition;
outputting the model parameter training result that satisfies the user-defined stability condition, and using the rules of the output model parameter training result to periodically update the age prediction results for users of unknown age.
Further, collecting the basic data information includes, but is not limited to: collecting registration information, access behavior data, order behavior data and/or basic data about artists.
Further, extracting the feature input variables and the target variable from the attributes of the basic data information to obtain sample data includes:
obtaining all attribute information in the basic data information;
extracting from the attribute information at least one input variable and at least one target variable related to age prediction, and arranging the at least one input variable and the at least one target variable to obtain the sample data.
Further, dividing the sample data into a training set and a test set, inputting the training set into a decision tree model for model parameter training, applying the model parameter training result to the test set, and testing for a model parameter training result that satisfies a user-defined stability condition includes:
dividing the sample data into a training set used for modeling and a test set used for verifying the effect of the model;
inputting the training set into the decision tree model for model parameter training;
applying the model parameter training result to the test set, and testing for a model parameter training result that satisfies the stability condition that the test set accuracy lies within a user-defined percentage of the training set accuracy.
Further, inputting the training set into the decision tree model for model parameter training includes:
inputting the feature input variables of the training set into the decision tree model, which performs variable selection and split-point selection based on the information gain ratio, thereby carrying out model parameter training.
Further, outputting the model parameter training result that satisfies the user-defined stability condition, and using the rules of the output model parameter training result to periodically update the age prediction results for users of unknown age, includes:
outputting the model parameter training result that satisfies the stability condition that the test set accuracy lies within the user-defined percentage of the training set accuracy; organizing the rules of the output model parameter training result into SQL WHERE conditions and deploying them in the system to periodically update the age prediction results for users of unknown age.
Another aspect of the present invention provides a system for predicting age characteristics based on a decision tree model, including:
an acquisition module, configured to collect basic data information;
an extraction module, configured to extract the feature input variables and the target variable from the attributes of the basic data information to obtain sample data;
a modeling module, configured to divide the sample data into a training set and a test set, input the training set into a decision tree model for model parameter training, apply the model parameter training result to the test set, and test for a model parameter training result that satisfies a user-defined stability condition;
an output module, configured to output the model parameter training result that satisfies the user-defined stability condition, and to use the rules of the output model parameter training result to periodically update the age prediction results for users of unknown age.
Further, the acquisition module includes, but is not limited to:
an acquisition unit, configured to collect registration information, access behavior data, order behavior data and/or basic data about artists.
Further, the extraction module includes:
an obtaining unit, configured to obtain all attribute information in the basic data information;
an extraction unit, configured to extract from the attribute information at least one input variable and at least one target variable related to age prediction, and to arrange the at least one input variable and the at least one target variable to obtain the sample data.
Further, the modeling module includes:
a training unit, configured to divide the sample data into a training set used for modeling and a test set used for verifying the effect of the model, and to input the training set into the decision tree model for model parameter training;
a modeling unit, configured to apply the model parameter training result to the test set and to test for a model parameter training result that satisfies the stability condition that the test set accuracy lies within a user-defined percentage of the training set accuracy.
Further, the training unit includes:
a training subunit, configured to input the feature input variables of the training set into the decision tree model, which performs variable selection and split-point selection based on the information gain ratio, thereby carrying out model parameter training.
Further, the output module includes:
an output unit, configured to output the model parameter training result that satisfies the stability condition that the test set accuracy lies within the user-defined percentage of the training set accuracy;
an update unit, configured to organize the rules of the output model parameter training result into SQL WHERE conditions and deploy them in the system to periodically update the age prediction results for users of unknown age.
Another aspect of the present invention provides a terminal for predicting age characteristics based on a decision tree model, including the system described in any of the foregoing.
The present invention collects basic data information; extracts the feature input variables and the target variable from the attributes of the basic data information to obtain sample data; divides the sample data into a training set and a test set, inputs the training set into a decision tree model for model parameter training, applies the model parameter training result to the test set, and tests for a model parameter training result that satisfies a user-defined stability condition; outputs the model parameter training result that satisfies the user-defined stability condition; and uses the rules of the output model parameter training result to periodically update the age prediction results for users of unknown age. Using the collected basic data information, it builds a prediction model, predicts users' ages, builds accurate user profiles, lays a solid data foundation for scenarios such as marketing, and improves the accuracy of age identification.
Brief description of the drawings
Fig. 1 is a flowchart of embodiment one, a method for predicting age characteristics based on a decision tree model according to the present invention;
Fig. 2 is a structural block diagram of embodiment two, a system for predicting age characteristics based on a decision tree model according to the present invention;
Fig. 3 is a structural block diagram of the acquisition module of the prediction system in embodiment two;
Fig. 4 is a structural block diagram of the extraction module of the prediction system in embodiment two;
Fig. 5 is a structural block diagram of the modeling module of the prediction system in embodiment two;
Fig. 6 is a structural block diagram of the training unit of the prediction system in embodiment two;
Fig. 7 is a structural block diagram of the output module of the prediction system in embodiment two;
Fig. 8 is a structural block diagram of embodiment three, a terminal for predicting age characteristics based on a decision tree model according to the present invention.
Detailed description
To help those skilled in the art better understand the solution of the present invention, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are evidently only some of the embodiments of the present invention, not all of them. All other embodiments obtained by a person of ordinary skill in the art on the basis of the embodiments of the present invention without creative effort shall fall within the scope of protection of the present invention.
It should be noted that the terms "first", "second" and the like in the description, the claims and the above drawings are used to distinguish similar objects, not to describe a specific order or sequence. It should be understood that data so used are interchangeable where appropriate, so that the embodiments of the present invention described herein can be implemented in orders other than those illustrated or described here. In addition, the terms "include" and "have" and any variants of them are intended to cover non-exclusive inclusion: for example, a process, method, system, product or device that contains a series of steps or units is not necessarily limited to the steps or units expressly listed, and may include other steps or units that are not expressly listed or that are inherent to such a process, method, product or device.
A decision tree is a decision analysis method that, given the known probabilities of various situations, builds a tree to obtain the probability that the expected net present value is greater than or equal to zero, so as to evaluate project risk and judge feasibility; it is a graphical method that applies probability analysis intuitively. Because the decision branches, when drawn, resemble the branches of a tree, the method is called a decision tree. In machine learning, a decision tree is a predictive model that represents a mapping between object attributes and object values. Entropy measures the disorder of a system; the tree-generation algorithms ID3, C4.5 and C5.0 all use entropy, a measure based on the concept of entropy in information theory.
A decision tree is a tree structure in which each internal node represents a test on an attribute, each branch represents an outcome of that test, and each leaf node represents a class.
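The tree structure just described can be illustrated with a minimal Python sketch. The class names `Node` and `Leaf`, the function `classify`, and the attribute names in the usage note are hypothetical and chosen only for illustration; they do not come from the patent.

```python
from dataclasses import dataclass, field
from typing import Dict, Union


@dataclass
class Leaf:
    """A leaf node: holds the class assigned to samples that reach it."""
    label: str


@dataclass
class Node:
    """An internal node: tests one attribute and has one branch per outcome."""
    attribute: str
    branches: Dict[str, Union["Node", Leaf]] = field(default_factory=dict)


def classify(tree, sample):
    """Follow the branch matching the sample's attribute value until a leaf."""
    while isinstance(tree, Node):
        tree = tree.branches[sample[tree.attribute]]
    return tree.label
```

For instance, a tree whose root tests a hypothetical `bought_family_item` attribute and whose "no" branch tests `prefers_groups` would route a sample through at most two tests before returning a class.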
Taking the business characteristics of the industry into account, the present invention divides user ages into tiers and applies a decision tree model: the continuous variable is converted into a discrete variable, so that the regression problem becomes a classification problem. Data processing, feature extraction and modeling are then carried out in a targeted way, which both preserves usability for the business and improves prediction accuracy.
Embodiment one
As shown in Fig. 1, one aspect of the present invention provides a method for predicting age characteristics based on a decision tree model, including step S110, step S120, step S130 and step S140.
In step S110, basic data information is collected.
Collecting the basic data information includes, but is not limited to: collecting registration information, access behavior data, order behavior data and/or basic data about artists.
In step S120, the feature input variables and the target variable are extracted from the attributes of the basic data information to obtain sample data.
This includes:
obtaining all attribute information in the basic data information; extracting from the attribute information at least one input variable and at least one target variable related to age prediction, and arranging the at least one input variable and the at least one target variable to obtain the sample data.
Preferably, eight input variables related to age prediction are extracted from the users' order information and the attributes of the artists' basic information:
√ whether the user has purchased parent-child items;
√ the user's historical highest spending level;
√ the number of artists the user prefers;
√ whether the user prefers groups;
√ the gender of the artists the user prefers;
√ the country in which the artists the user prefers developed their careers (for example, an artist who is Chinese but whose career developed in Korea);
√ the age of the artists the user prefers;
√ the region in which the artists the user prefers developed their careers;
One target variable: the user's age tier (a three-valued variable [student stage, rising stage, stable stage]):
Student stage: age <= 22;
Rising stage: 22 < age <= 30;
Stable stage: age > 30;
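The three age tiers above turn a continuous age into a discrete target variable. A minimal Python sketch of that mapping follows; the function name `age_tier` and the English tier labels are assumptions made for illustration, while the thresholds 22 and 30 are the ones stated in the text.

```python
def age_tier(age: int) -> str:
    """Map an age in years to one of the three tiers used as the target variable."""
    if age <= 22:
        return "student"   # student stage: age <= 22
    if age <= 30:
        return "rising"    # rising stage: 22 < age <= 30
    return "stable"        # stable stage: age > 30
```

Labelling the training samples this way is what lets a classifier, rather than a regression model, be trained on them.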
In step S130, the sample data are divided into a training set and a test set, the training set is input into the decision tree model for model parameter training, the model parameter training result is applied to the test set, and a model parameter training result that satisfies the user-defined stability condition is tested for.
This includes:
dividing the sample data into a training set used for modeling and a test set used for verifying the effect of the model. Here the sample data are the data used for modeling. If there are 1,000,000 samples, for example, they need to be divided into two parts, one for modeling and one for verifying the effect of the model, e.g., training set : test set = 6:4. The 6:4 ratio is adjustable, but in general the training set takes the larger share; commonly used split ratios are 5:5, 6:4, 7:3, 75:25, and so on.
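The split described above can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `split_samples`, the shuffling step and the fixed seed are assumptions, since the patent states the ratio but not how samples are assigned to each part.

```python
import random


def split_samples(samples, train_fraction=0.6, seed=0):
    """Shuffle the samples and split them into (training set, test set).

    train_fraction=0.6 corresponds to the 6:4 ratio; 0.5, 0.7 or 0.75
    give the other commonly used ratios.
    """
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]
```

With 1,000,000 samples and the default fraction, this would yield 600,000 training samples and 400,000 test samples.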
The training set is input into the decision tree model for model parameter training; the model parameter training result is applied to the test set, and a model parameter training result is tested for that satisfies the stability condition that the test set accuracy lies within a user-defined percentage of the training set accuracy.
Applying the model parameter training result to the test set checks the stability of the training result. That is, the overall accuracy on the test set should be consistent with that on the training set; the specific criterion for consistency can be chosen according to what is acceptable in the concrete scenario. As a reference: the test set accuracy should be within ±10% of the training set accuracy.
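The stability check can be written as a short predicate. The sketch below assumes that "within ±10%" is meant relative to the training set accuracy; the text does not say whether the band is relative or absolute, so this reading, like the function name `is_stable`, is an assumption.

```python
def is_stable(train_accuracy, test_accuracy, tolerance=0.10):
    """Return True if test accuracy lies within +/- tolerance (relative)
    of the training accuracy."""
    lower = train_accuracy * (1 - tolerance)
    upper = train_accuracy * (1 + tolerance)
    return lower <= test_accuracy <= upper
```

Under this reading, a model with 80% training accuracy would pass with a test accuracy anywhere between 72% and 88%; a training result that fails would be retrained with adjusted parameters.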
Inputting the training set into the decision tree model for model parameter training includes: inputting the feature input variables of the training set into the decision tree model, which performs variable selection and split-point selection based on the information gain ratio, thereby carrying out model parameter training.
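To make the selection criterion concrete, the following Python sketch computes the information gain ratio of one categorical feature, in the way the C4.5 family of algorithms defines it (information gain divided by split information). The function names are illustrative, and the patent does not state which implementation it uses.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(values)
    return -sum((n / total) * log2(n / total) for n in Counter(values).values())


def gain_ratio(feature_values, labels):
    """Information gain of splitting on the feature, divided by split information."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Entropy of the labels remaining after the split, weighted by group size.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    split_info = entropy(feature_values)
    if split_info == 0:  # feature takes a single value, so it cannot split
        return 0.0
    return (entropy(labels) - conditional) / split_info
```

At each node, the feature (or, for a numeric feature, the candidate split point) with the highest gain ratio would be chosen. Dividing by the split information penalizes features that fragment the data into many small groups.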
In step S140, the model parameter training result that satisfies the user-defined stability condition is output, and the rules of the output model parameter training result are used to periodically update the age prediction results for users of unknown age.
This includes:
outputting the model parameter training result that satisfies the stability condition that the test set accuracy lies within the user-defined percentage of the training set accuracy; organizing the rules of the output model parameter training result into SQL WHERE conditions and deploying them in the system to periodically update the age prediction results for users of unknown age.
A concrete application example: for a user A of unknown age, suppose the following conditions hold: the birth year of the artists the user prefers > 1995 and the birth year of the artists the user prefers < 1998. User A is then judged to be in the student stage.
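The rule in the example above can be rendered as an SQL WHERE condition mechanically. The Python sketch below shows one way to do this; the table name `users`, the columns `preferred_artist_birth_year` and `predicted_age_tier`, and both function names are hypothetical, since the patent does not describe its schema.

```python
def rule_to_where(conditions):
    """Join (column, operator, value) triples from one tree path into a WHERE clause."""
    return " AND ".join(f"{column} {op} {value}" for column, op, value in conditions)


def rule_to_update(table, conditions, tier):
    """Build an UPDATE statement that assigns the tier to every matching user."""
    where = rule_to_where(conditions)
    return f"UPDATE {table} SET predicted_age_tier = '{tier}' WHERE {where}"


# The example rule: preferred artist born after 1995 and before 1998 -> student stage.
student_rule = [
    ("preferred_artist_birth_year", ">", 1995),
    ("preferred_artist_birth_year", "<", 1998),
]
statement = rule_to_update("users", student_rule, "student")
```

Each root-to-leaf path of the trained tree would yield one such statement, and running the statements on a schedule would refresh the predictions. Plain string formatting is used here only because the rule values come from the trained model; values from untrusted input would need parameterized queries instead.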
A practical application example: decision tree models come in many algorithmic variants, but once the approach is fixed in the system, the relevant algorithm package can be called directly for training; it is only necessary to adjust the relevant parameters to meet actual requirements.
Embodiment one of the present invention collects basic data information; extracts the feature input variables and the target variable from the attributes of the basic data information to obtain sample data; divides the sample data into a training set and a test set, inputs the training set into a decision tree model for model parameter training, applies the model parameter training result to the test set, and tests for a model parameter training result that satisfies a user-defined stability condition; outputs the model parameter training result that satisfies the user-defined stability condition; and uses the rules of the output model parameter training result to periodically update the age prediction results for users of unknown age. Using the collected basic data information, it builds a prediction model, predicts users' ages, builds accurate user profiles, lays a solid data foundation for scenarios such as marketing, and improves the accuracy of age identification.
Embodiment two
As shown in Fig. 2, another aspect of the present invention provides a system 200 for predicting age characteristics based on a decision tree model, including:
an acquisition module 21, configured to collect basic data information;
an extraction module 22, configured to extract the feature input variables and the target variable from the attributes of the basic data information to obtain sample data;
a modeling module 23, configured to divide the sample data into a training set and a test set, input the training set into a decision tree model for model parameter training, apply the model parameter training result to the test set, and test for a model parameter training result that satisfies a user-defined stability condition;
an output module 24, configured to output the model parameter training result that satisfies the user-defined stability condition, and to use the rules of the output model parameter training result to periodically update the age prediction results for users of unknown age.
Further, as shown in Fig. 3, the acquisition module 21 includes, but is not limited to:
an acquisition unit 211, configured to collect registration information, access behavior data, order behavior data and/or basic data about artists.
Further, as shown in Fig. 4, the extraction module 22 includes:
an obtaining unit 221, configured to obtain all attribute information in the basic data information;
an extraction unit 222, configured to extract from the attribute information at least one input variable and at least one target variable related to age prediction, and to arrange the at least one input variable and the at least one target variable to obtain the sample data.
Further, as shown in Fig. 5, the modeling module 23 includes:
a training unit 231, configured to divide the sample data into a training set used for modeling and a test set used for verifying the effect of the model, and to input the training set into the decision tree model for model parameter training;
a modeling unit 232, configured to apply the model parameter training result to the test set and to test for a model parameter training result that satisfies the stability condition that the test set accuracy lies within a user-defined percentage of the training set accuracy.
Further, as shown in Fig. 6, the training unit 231 includes:
a training subunit 2311, configured to input the feature input variables of the training set into the decision tree model, which performs variable selection and split-point selection based on the information gain ratio, thereby carrying out model parameter training.
Further, as shown in Fig. 7, the output module 24 includes:
an output unit 241, configured to output the model parameter training result that satisfies the stability condition that the test set accuracy lies within the user-defined percentage of the training set accuracy;
an update unit 242, configured to organize the rules of the output model parameter training result into SQL WHERE conditions and deploy them in the system to periodically update the age prediction results for users of unknown age.
In embodiment two of the present invention, the acquisition module collects basic data information; the extraction module extracts the feature input variables and the target variable from the attributes of the basic data information to obtain sample data; the modeling module divides the sample data into a training set and a test set, inputs the training set into a decision tree model for model parameter training, applies the model parameter training result to the test set, and tests for a model parameter training result that satisfies a user-defined stability condition; the output module outputs the model parameter training result that satisfies the user-defined stability condition and uses the rules of the output model parameter training result to periodically update the age prediction results for users of unknown age. Using the collected basic data information, the system builds a prediction model, predicts users' ages, builds accurate user profiles, lays a solid data foundation for scenarios such as marketing, and improves the accuracy of age identification.
Embodiment three
As shown in figure 8, another aspect of the present invention additionally provides a kind of prediction end of the age characteristicss based on decision-tree model End 300, including the system 200 described in two any one of embodiment.
The embodiment of the present invention three is by gathering basic data information;The feature extracted in the basic data information attribute is defeated Enter variable and target variable, obtain sample data;Sample data is divided into training set and test set, training set is input to decision-making Model parameter training is carried out in tree-model, by model parameter training result, test set is applied to, and test meets self-defined stability The model parameter training result of condition;The model parameter training result for meeting self-defined stability condition is exported;By output The rule of model parameter training result regularly updates the age for unknown subscriber and predicts the outcome, using the basic data letter of collection Breath, builds forecast model, predicts the age of user, accurately builds user's portrait, is that the scenes such as marketing lay solid data Basis, improves the accuracy of age identification.
The embodiments of the present invention are for illustration only, do not represent the quality of embodiment.
It should be noted that for aforesaid each method embodiment, in order to be briefly described, therefore which is all expressed as a series of Combination of actions, but those skilled in the art should know, the present invention do not limited by described sequence of movement because According to the present invention, some steps can be carried out using other orders or simultaneously.Secondly, those skilled in the art should also know Know, embodiment described in this description belongs to preferred embodiment, involved action and module are not necessarily of the invention Necessary.
In the above-described embodiments, the description of each embodiment is all emphasized particularly on different fields, in certain embodiment, there is no the portion that describes in detail Point, may refer to the associated description of other embodiment.
In several embodiments provided herein, it should be understood that disclosed device, can be by another way Realize.For example, device embodiment described above is only the schematically division of for example described unit, is only one kind Division of logic function, can have when actually realizing other dividing mode, for example multiple units or component can in conjunction with or can To be integrated into another system, or some features can be ignored, or not execute.Another, shown or discussed each other Coupling or direct-coupling or communication connection can be INDIRECT COUPLING or communication connection by some interfaces, device or unit, Can be electrical or other forms.
The unit that illustrates as separating component can be or may not be physically separate, aobvious as unit The part for showing can be or may not be physical location, you can be located at a place, or can also be distributed to multiple On NE.Some or all of unit therein can be selected according to the actual needs to realize the mesh of this embodiment scheme 's.
In addition, each functional unit in each embodiment of the invention can be integrated in a processing unit, it is also possible to It is that unit is individually physically present, it is also possible to which two or more units are integrated in a unit.Above-mentioned integrated list Unit both can be realized in the form of hardware, it would however also be possible to employ the form of SFU software functional unit is realized.
It should be noted that, depending on implementation needs, each step or component described in this application may be split into more steps or components, and two or more steps or components, or partial operations of steps or components, may be combined into new steps or components to achieve the purpose of the present invention.
The above method according to the present invention may be implemented in hardware or firmware, or implemented as software or computer code that can be stored in a recording medium (such as a CD-ROM, RAM, floppy disk, hard disk, or magneto-optical disk), or implemented as computer code that is originally stored in a remote recording medium or a non-transitory machine-readable medium, downloaded over a network, and stored in a local recording medium, so that the method described here can be processed by such software stored on a recording medium using a general-purpose computer, a special-purpose processor, or programmable or dedicated hardware (such as an ASIC or FPGA). It can be understood that a computer, processor, microprocessor controller, or programmable hardware includes a storage component (for example, RAM, ROM, or flash memory) that can store or receive software or computer code, and that the processing method described here is realized when the software or computer code is accessed and executed by the computer, processor, or hardware. Furthermore, when a general-purpose computer accesses code for implementing the processing shown here, execution of the code converts the general-purpose computer into a special-purpose computer for executing that processing.
The above is merely a specific embodiment of the present invention, but the protection scope of the present invention is not limited thereto. Any change or replacement that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention shall fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be defined by the scope of the claims.

Claims (13)

1. A method for predicting age characteristics based on a decision tree model, characterized by comprising:
collecting basic data information;
extracting feature input variables and a target variable from the attributes of the basic data information to obtain sample data;
dividing the sample data into a training set and a test set, inputting the training set into a decision tree model for model parameter training, applying the model parameter training result to the test set, and testing for a model parameter training result that satisfies a user-defined stability condition;
outputting the model parameter training result that satisfies the user-defined stability condition; and using the rules of the output model parameter training result to periodically update the age prediction results for unknown users.
2. The method of claim 1, characterized in that collecting the basic data information includes, but is not limited to: collecting registration information, access behavior data, order-placing behavior data, and/or basic data of artists.
3. The method of claim 1 or 2, characterized in that extracting the feature input variables and the target variable from the attributes of the basic data information to obtain the sample data comprises:
obtaining all attribute information in the basic data information;
extracting from the attribute information at least one input variable and at least one target variable related to age prediction, and arranging the at least one input variable and the at least one target variable to obtain the sample data.
4. The method of any one of claims 1-3, characterized in that dividing the sample data into a training set and a test set, inputting the training set into the decision tree model for model parameter training, applying the model parameter training result to the test set, and testing for a model parameter training result that satisfies the user-defined stability condition comprises:
dividing the sample data into a training set used for modeling and a test set used for verifying the modeling effect;
inputting the training set into the decision tree model for model parameter training;
applying the model parameter training result to the test set, and testing for a model parameter training result that satisfies the stability condition that the test set accuracy is within a user-defined percentage of the training set accuracy.
5. The method of claim 4, characterized in that inputting the training set into the decision tree model for model parameter training comprises:
inputting the feature input variables of the training set into the decision tree model, wherein the decision tree model performs variable selection and split point selection based on the information gain ratio to carry out the model parameter training.
6. The method of any one of claims 1-4, characterized in that outputting the model parameter training result that satisfies the user-defined stability condition, and using the rules of the output model parameter training result to periodically update the age prediction results for unknown users, comprises:
outputting the model parameter training result that satisfies the stability condition that the test set accuracy is within a user-defined percentage of the training set accuracy; and organizing the rules of the output model parameter training result into SQL WHERE conditions and deploying them in the system to periodically update the age prediction results for unknown users.
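As an illustration of the training and testing steps recited in the method claims, the following is a minimal Python sketch of split point selection by information gain ratio and of one possible reading of the stability condition. It is not the applicant's implementation. The function names, the midpoint candidate thresholds, the sample values, and the interpretation of "within a user-defined percentage" as a relative gap between training and test accuracy are all assumptions made for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def gain_ratio(values, labels, threshold):
    """Information gain ratio of the binary split `value <= threshold`."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    if not left or not right:
        return 0.0  # a split that separates nothing carries no information
    total = len(labels)
    weights = (len(left) / total, len(right) / total)
    gain = entropy(labels) - weights[0] * entropy(left) - weights[1] * entropy(right)
    split_info = -sum(w * math.log2(w) for w in weights)
    return gain / split_info


def best_split(values, labels):
    """Return (threshold, gain_ratio) of the best midpoint split, or None."""
    points = sorted(set(values))
    # Assumption: candidate thresholds are midpoints between distinct values.
    candidates = [(a + b) / 2 for a, b in zip(points, points[1:])]
    if not candidates:
        return None  # a constant variable cannot be split
    return max(((t, gain_ratio(values, labels, t)) for t in candidates),
               key=lambda pair: pair[1])


def is_stable(train_accuracy, test_accuracy, tolerance=0.05):
    """Assumed stability check: relative accuracy gap no larger than `tolerance`."""
    return abs(train_accuracy - test_accuracy) <= tolerance * train_accuracy


if __name__ == "__main__":
    # Hypothetical feature (orders per month) and age-group labels.
    orders = [1, 2, 3, 10, 11, 12]
    groups = ["young", "young", "young", "older", "older", "older"]
    print(best_split(orders, groups))
    print(is_stable(train_accuracy=0.80, test_accuracy=0.78))
```

Variable selection then amounts to running `best_split` on each candidate input variable and keeping the variable whose best split has the highest gain ratio. The tolerance of 5% is only a placeholder for the user-defined percentage.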
7. A system for predicting age characteristics based on a decision tree model, characterized by comprising:
an acquisition module, configured to collect basic data information;
an extraction module, configured to extract feature input variables and a target variable from the attributes of the basic data information to obtain sample data;
a modeling module, configured to divide the sample data into a training set and a test set, input the training set into a decision tree model for model parameter training, apply the model parameter training result to the test set, and test for a model parameter training result that satisfies a user-defined stability condition;
an output module, configured to output the model parameter training result that satisfies the user-defined stability condition, and to use the rules of the output model parameter training result to periodically update the age prediction results for unknown users.
8. The system of claim 7, characterized in that the acquisition module includes, but is not limited to:
a collecting unit, configured to collect registration information, access behavior data, order-placing behavior data, and/or basic data of artists.
9. The system of claim 7 or 8, characterized in that the extraction module comprises:
an acquiring unit, configured to obtain all attribute information in the basic data information;
an extracting unit, configured to extract from the attribute information at least one input variable and at least one target variable related to age prediction, and to arrange the at least one input variable and the at least one target variable to obtain the sample data.
10. The system of claim 7, characterized in that the modeling module comprises:
a training unit, configured to divide the sample data into a training set used for modeling and a test set used for verifying the modeling effect, and to input the training set into the decision tree model for model parameter training;
a modeling unit, configured to apply the model parameter training result to the test set, and to test for a model parameter training result that satisfies the stability condition that the test set accuracy is within a user-defined percentage of the training set accuracy.
11. The system of claim 10, characterized in that the training unit comprises:
a training subunit, configured to input the feature input variables of the training set into the decision tree model, wherein the decision tree model performs variable selection and split point selection based on the information gain ratio to carry out the model parameter training.
12. The system of claim 7, characterized in that the output module comprises:
an output unit, configured to output the model parameter training result that satisfies the stability condition that the test set accuracy is within a user-defined percentage of the training set accuracy;
an updating unit, configured to organize the rules of the output model parameter training result into SQL WHERE conditions and deploy them in the system to periodically update the age prediction results for unknown users.
13. A terminal for predicting age characteristics based on a decision tree model, comprising the system of any one of claims 7-12.
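The claims recite organizing the rules of a trained decision tree into SQL WHERE conditions. The following minimal Python sketch shows one way such a conversion could look. It is illustrative only: the nested-dictionary tree layout, the table name `users`, the column names, the thresholds, and the age-group labels are all hypothetical and do not come from the patent.

```python
def tree_to_where_clauses(node, path=()):
    """Yield (where_clause, predicted_label) for every leaf of a binary tree.

    A leaf is a dict with a "label" key; an internal node has "column",
    "threshold", "left" (value <= threshold) and "right" (value > threshold).
    """
    if "label" in node:
        # A tree consisting of a single leaf matches every row.
        yield (" AND ".join(path) or "1 = 1", node["label"])
        return
    column, threshold = node["column"], node["threshold"]
    yield from tree_to_where_clauses(node["left"], path + (f"{column} <= {threshold}",))
    yield from tree_to_where_clauses(node["right"], path + (f"{column} > {threshold}",))


# Hypothetical trained tree; every name and number here is made up.
example_tree = {
    "column": "orders_per_month",
    "threshold": 3,
    "left": {"label": "under_25"},
    "right": {
        "column": "late_night_visits",
        "threshold": 10,
        "left": {"label": "25_to_40"},
        "right": {"label": "under_25"},
    },
}

rules = list(tree_to_where_clauses(example_tree))

if __name__ == "__main__":
    for where, label in rules:
        # Only rows with no known age are updated, mirroring "unknown users".
        print(f"UPDATE users SET predicted_age_group = '{label}' "
              f"WHERE age IS NULL AND {where};")
```

Because each leaf corresponds to a disjoint region of the input space, the generated statements can be run in any order by a scheduled job. Column names and thresholds are interpolated directly into the SQL text, which is only acceptable here because they come from the trained model rather than from user input.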
CN201610989789.7A 2016-11-10 2016-11-10 Method, system and terminal for predicting age characteristics based on a decision tree model Pending CN106503863A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610989789.7A CN106503863A (en) 2016-11-10 2016-11-10 Method, system and terminal for predicting age characteristics based on a decision tree model

Publications (1)

Publication Number Publication Date
CN106503863A true CN106503863A (en) 2017-03-15

Family

ID=58323996

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610989789.7A Pending CN106503863A (en) Method, system and terminal for predicting age characteristics based on a decision tree model 2016-11-10 2016-11-10

Country Status (1)

Country Link
CN (1) CN106503863A (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100312724A1 (en) * 2007-11-02 2010-12-09 Thomas Pinckney Inferring user preferences from an internet based social interactive construct
WO2011019731A2 (en) * 2009-08-10 2011-02-17 Mintigo Ltd. Systems and methods for gererating leads in a network by predicting properties of external nodes
CN102859967A (en) * 2010-03-01 2013-01-02 诺基亚公司 Method and apparatus for estimating user characteristics based on user interaction data
CN103886074A (en) * 2014-03-24 2014-06-25 江苏名通信息科技有限公司 Commodity recommendation system based on social media
CN103927675A (en) * 2014-04-18 2014-07-16 北京京东尚科信息技术有限公司 Method and device for judging age brackets of users
CN104598607A (en) * 2015-01-29 2015-05-06 百度在线网络技术(北京)有限公司 Method and system for recommending search phrase
CN106022800A (en) * 2016-05-16 2016-10-12 北京百分点信息科技有限公司 User feature data processing method and device

Cited By (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106951692A (en) * 2017-03-06 2017-07-14 北京师范大学 Exemplary resource elements recognition method and device
CN107273883B (en) * 2017-05-03 2020-04-21 天方创新(北京)信息技术有限公司 Decision tree model training method, and method and device for determining data attributes in OCR (optical character recognition) result
CN107273883A (en) * 2017-05-03 2017-10-20 天方创新(北京)信息技术有限公司 Decision-tree model training method, determine data attribute method and device in OCR result
CN107316204A (en) * 2017-05-27 2017-11-03 银联智惠信息服务(上海)有限公司 Recognize humanized method, device, computer-readable medium and the system of holding
CN111279304B (en) * 2017-09-29 2023-08-15 甲骨文国际公司 Method and system for configuring communication decision tree based on locatable elements connected on canvas
US11900267B2 (en) 2017-09-29 2024-02-13 Oracle International Corporation Methods and systems for configuring communication decision trees based on connected positionable elements on canvas
CN111279304A (en) * 2017-09-29 2020-06-12 甲骨文国际公司 Method and system for configuring a communication decision tree based on connected locatable elements on a canvas
US11775843B2 (en) 2017-09-29 2023-10-03 Oracle International Corporation Directed trajectories through communication decision tree using iterative artificial intelligence
CN108549954A (en) * 2018-03-26 2018-09-18 平安科技(深圳)有限公司 Risk model training method, risk identification method, device, equipment and medium
CN108549954B (en) * 2018-03-26 2022-08-02 平安科技(深圳)有限公司 Risk model training method, risk identification device, risk identification equipment and risk identification medium
WO2019218751A1 (en) * 2018-05-16 2019-11-21 阿里巴巴集团控股有限公司 Processing method, apparatus and device for risk prediction of insurance service
CN109657482A (en) * 2018-10-26 2019-04-19 阿里巴巴集团控股有限公司 A kind of verification method and device of data validity
CN109657482B (en) * 2018-10-26 2022-11-18 创新先进技术有限公司 Data validity verification method, device and equipment
CN109376932A (en) * 2018-10-30 2019-02-22 平安医疗健康管理股份有限公司 Age prediction method, device, server and storage medium based on prediction model
CN109508558B (en) * 2018-10-31 2022-11-18 创新先进技术有限公司 Data validity verification method, device and equipment
CN109508558A (en) * 2018-10-31 2019-03-22 阿里巴巴集团控股有限公司 A kind of verification method and device of data validity
CN111325372A (en) * 2018-12-13 2020-06-23 北京京东尚科信息技术有限公司 Prediction model establishment method, prediction method, device, medium and equipment
CN111325372B (en) * 2018-12-13 2025-04-18 北京京东尚科信息技术有限公司 Method for establishing prediction model, prediction method, device, medium and equipment
CN110415020A (en) * 2019-07-01 2019-11-05 北京三快在线科技有限公司 Age prediction method, device and electronic equipment
CN111340276B (en) * 2020-02-19 2022-08-19 联想(北京)有限公司 Method and system for generating prediction data
CN111340276A (en) * 2020-02-19 2020-06-26 联想(北京)有限公司 Method and system for generating prediction data
CN113947417A (en) * 2020-07-15 2022-01-18 上海哔哩哔哩科技有限公司 Training method and device of age identification model and age identification method and device
CN112712900A (en) * 2021-01-08 2021-04-27 昆山杜克大学 Physiological age prediction model based on machine learning and establishment method thereof
CN113838578A (en) * 2021-09-27 2021-12-24 南方医科大学珠江医院 Big data-based trachea foreign matter emergency treatment device with striking hammer

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination