CN106503863A - Method, system and terminal for predicting age characteristics based on a decision-tree model - Google Patents
Method, system and terminal for predicting age characteristics based on a decision-tree model
- Publication number
- CN106503863A (application CN201610989789.7A)
- Authority
- CN
- China
- Prior art keywords
- model parameter
- training
- decision
- age
- model
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/951—Indexing; Web crawling techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/02—Marketing; Price estimation or determination; Fundraising
- G06Q30/0201—Market modelling; Market analysis; Collecting market data
Abstract
The present invention provides a method, system and terminal for predicting age characteristics based on a decision-tree model. The method includes: collecting basic data; extracting feature input variables and a target variable from the attributes of the basic data to obtain sample data; dividing the sample data into a training set and a test set, inputting the training set into a decision-tree model to train the model parameters, applying the model parameter training result to the test set, and testing for a model parameter training result that meets a user-defined stability condition; outputting the model parameter training result that meets the user-defined stability condition; and using the rules of the output model parameter training result to periodically update the predicted age of users whose age is unknown. The method, system and terminal build a prediction model, predict users' ages, build accurate user profiles, lay a solid data foundation for scenarios such as marketing, and improve the accuracy of age identification.
Description
Technical field
The present invention relates to the field of e-commerce, and in particular to a method, system and terminal for predicting age characteristics based on a decision-tree model.
Background art
In user-centered industries such as e-commerce and social networking, it is often necessary to know users' real ages so that users can be classified and their behavioral characteristics and preferences studied. However, the network is a virtual world, and users browsing it tend to be wary and therefore hide part of their true identity.
Nevertheless, groups of different ages differ in their behavioral characteristics, and a user's real age can show through his or her own behavior. Targeted processing of behavioral data and feature extraction can greatly improve the accuracy of age prediction.
In the prior art, age prediction is performed with a regression model. In the course of research, the inventor found that because age is itself a continuous variable, the results predicted by a regression model are not very accurate, for the following reasons:
1. Users of the same age often differ considerably in their outward behavioral characteristics because of the living environments they are in;
2. Users of different ages, especially users of similar ages, often show no clearly distinguishing behavioral characteristics.
As a result, the error of a regression model in predicting real age is large.
Content of the invention
Present invention is primarily targeted at provide a kind of Forecasting Methodology of the age characteristicss based on decision-tree model, system and
Terminal, to overcome the age characteristicss of existing e-commerce field to predict difficult technical problem.
One aspect of the present invention provides a kind of Forecasting Methodology of the age characteristicss based on decision-tree model, including:
Collection basic data information;
The feature input variable and target variable in the basic data information attribute is extracted, sample data is obtained;
Sample data is divided into training set and test set, training set is input in decision-tree model carries out model parameter instruction
Practice, by model parameter training result, be applied to test set, test meets the model parameter training knot of self-defined stability condition
Really;
The model parameter training result for meeting self-defined stability condition is exported;Model parameter training result by output
Age for regularly updating as unknown subscriber of rule predict the outcome.
Further, collecting the basic data includes, but is not limited to: collecting registration information, access behavior data, ordering behavior data and/or basic data of artists.
Further, extracting the feature input variables and the target variable from the attributes of the basic data to obtain sample data includes:
obtaining all attribute information in the basic data;
extracting from the attribute information at least one input variable related to age prediction and at least one target variable, and organizing the at least one input variable and the at least one target variable into sample data.
Further, dividing the sample data into a training set and a test set, inputting the training set into the decision-tree model to train the model parameters, applying the model parameter training result to the test set, and testing for a model parameter training result that meets the user-defined stability condition includes:
dividing the sample data into a training set used for modeling and a test set used for verifying the modeling effect;
inputting the training set into the decision-tree model to train the model parameters;
applying the model parameter training result to the test set, and testing for a model parameter training result that meets the stability condition that the test-set accuracy is within a user-defined percentage of the training-set accuracy.
Further, inputting the training set into the decision-tree model to train the model parameters includes:
inputting the feature input variables of the training set into the decision-tree model, the decision-tree model performing variable selection and split-point selection based on the information gain ratio to train the model parameters.
Further, outputting the model parameter training result that meets the user-defined stability condition and using the rules of the output model parameter training result to periodically update the predicted age of users whose age is unknown includes:
outputting the model parameter training result that meets the stability condition that the test-set accuracy is within a user-defined percentage of the training-set accuracy; organizing the rules of the output model parameter training result into SQL WHERE conditions, and deploying them in the system to periodically update the predicted age of users whose age is unknown.
Another aspect of the present invention further provides a system for predicting age characteristics based on a decision-tree model, including:
an acquisition module, configured to collect basic data;
an extraction module, configured to extract feature input variables and a target variable from the attributes of the basic data to obtain sample data;
a modeling module, configured to divide the sample data into a training set and a test set, input the training set into a decision-tree model to train the model parameters, apply the model parameter training result to the test set, and test for a model parameter training result that meets a user-defined stability condition;
an output module, configured to output the model parameter training result that meets the user-defined stability condition, and to use the rules of the output model parameter training result to periodically update the predicted age of users whose age is unknown.
Further, the acquisition module includes, but is not limited to:
a collecting unit, configured to collect registration information, access behavior data, ordering behavior data and/or basic data of artists.
Further, the extraction module includes:
an acquiring unit, configured to obtain all attribute information in the basic data;
an extracting unit, configured to extract from the attribute information at least one input variable related to age prediction and at least one target variable, and to organize the at least one input variable and the at least one target variable into sample data.
Further, the modeling module includes:
a training unit, configured to divide the sample data into a training set used for modeling and a test set used for verifying the modeling effect, and to input the training set into the decision-tree model to train the model parameters;
a modeling unit, configured to apply the model parameter training result to the test set and test for a model parameter training result that meets the stability condition that the test-set accuracy is within a user-defined percentage of the training-set accuracy.
Further, the training unit includes:
a training subunit, configured to input the feature input variables of the training set into the decision-tree model, the decision-tree model performing variable selection and split-point selection based on the information gain ratio to train the model parameters.
Further, the output module includes:
an output unit, configured to output the model parameter training result that meets the stability condition that the test-set accuracy is within a user-defined percentage of the training-set accuracy;
an updating unit, configured to organize the rules of the output model parameter training result into SQL WHERE conditions and deploy them in the system to periodically update the predicted age of users whose age is unknown.
Another aspect of the present invention further provides a terminal for predicting age characteristics based on a decision-tree model, including the system according to any one of the foregoing.
The present invention collects basic data; extracts feature input variables and a target variable from the attributes of the basic data to obtain sample data; divides the sample data into a training set and a test set, inputs the training set into a decision-tree model to train the model parameters, applies the model parameter training result to the test set, and tests for a model parameter training result that meets a user-defined stability condition; outputs the model parameter training result that meets the user-defined stability condition; and uses the rules of the output model parameter training result to periodically update the predicted age of users whose age is unknown. Using the collected basic data, it builds a prediction model, predicts users' ages, builds accurate user profiles, lays a solid data foundation for scenarios such as marketing, and improves the accuracy of age identification.
Description of the drawings
Fig. 1 is a flow chart of embodiment one, a method for predicting age characteristics based on a decision-tree model according to the present invention;
Fig. 2 is a structural block diagram of embodiment two, a system for predicting age characteristics based on a decision-tree model according to the present invention;
Fig. 3 is a structural block diagram of the acquisition module in embodiment two of the system for predicting age characteristics based on a decision-tree model according to the present invention;
Fig. 4 is a structural block diagram of the extraction module in embodiment two of the system for predicting age characteristics based on a decision-tree model according to the present invention;
Fig. 5 is a structural block diagram of the modeling module in embodiment two of the system for predicting age characteristics based on a decision-tree model according to the present invention;
Fig. 6 is a structural block diagram of the training unit in embodiment two of the system for predicting age characteristics based on a decision-tree model according to the present invention;
Fig. 7 is a structural block diagram of the output module in embodiment two of the system for predicting age characteristics based on a decision-tree model according to the present invention;
Fig. 8 is a structural block diagram of embodiment three, a terminal for predicting age characteristics based on a decision-tree model according to the present invention.
Detailed description of the embodiments
To enable those skilled in the art to better understand the solution of the present invention, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by a person of ordinary skill in the art on the basis of the embodiments of the present invention without creative effort shall fall within the scope of protection of the present invention.
It should be noted that the terms "first", "second" and the like in the description, the claims and the above drawings are used to distinguish similar objects and are not used to describe a specific order or sequence. It should be understood that data so used may be interchanged where appropriate, so that the embodiments of the present invention described herein can be implemented in orders other than those illustrated or described here. In addition, the terms "include" and "have" and any variations of them are intended to cover non-exclusive inclusion; for example, a process, method, system, product or device that contains a series of steps or units is not necessarily limited to the steps or units expressly listed, but may include other steps or units that are not expressly listed or that are inherent to the process, method, product or device.
A decision tree is a decision-analysis method that, given the known probabilities of various situations occurring, builds a tree to obtain the probability that the expected net present value is greater than or equal to zero, evaluates project risk, and judges feasibility; it is a graphical method that applies probability analysis intuitively. Because the decision branches, when drawn, resemble the branches of a tree, the method is called a decision tree. In machine learning, a decision tree is a prediction model that represents a mapping between object attributes and object values. Entropy measures the disorder of a system; the tree-generation algorithms ID3, C4.5 and C5.0 use entropy. This measure is based on the concept of entropy in information theory.
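The entropy measure mentioned above can be illustrated with a short sketch. It computes the Shannon entropy of a list of class labels, which is the standard information-theoretic definition; the function name and the label strings are hypothetical and are not taken from the patent.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a sequence of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    # Sum -p * log2(p) over the observed classes.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())
```

A set of samples that all share one age tier has entropy 0, while a set split evenly between two tiers has entropy 1 bit; tree-generation algorithms prefer splits that reduce this value.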
A decision tree is a tree structure in which each internal node represents a test on an attribute, each branch represents an outcome of that test, and each leaf node represents a class.
By means of a decision-tree model, and in combination with the business characteristics of the industry itself, the present invention divides user age into tiers, converting a continuous variable into a discrete variable and the regression problem into a classification problem. Data processing, feature extraction and modeling are carried out in a targeted manner, which both ensures usability for the business and improves prediction accuracy.
Embodiment one
As shown in Fig. 1, one aspect of the present invention provides a method for predicting age characteristics based on a decision-tree model, including step S110, step S120, step S130 and step S140.
In step S110, basic data is collected.
Collecting the basic data includes, but is not limited to: collecting registration information, access behavior data, ordering behavior data and/or basic data of artists.
In step S120, feature input variables and a target variable are extracted from the attributes of the basic data to obtain sample data.
This includes:
obtaining all attribute information in the basic data; extracting from the attribute information at least one input variable related to age prediction and at least one target variable, and organizing the at least one input variable and the at least one target variable into sample data.
Preferably, eight input variables related to age prediction are extracted from the attribute information of the user's order information and the artists' basic information:
√ whether the user has purchased parent-child items;
√ the user's highest historical consumption level;
√ the number of artists the user prefers;
√ whether the user prefers groups;
√ the gender of the artists the user prefers;
√ the country in which the artists the user prefers developed their careers (for example, Lu Han is Chinese, and the country of career development is South Korea);
√ the age of the artists the user prefers;
√ the place in which the artists the user prefers developed their careers;
One target variable: user age tier (a three-valued variable [student stage, rising stage, stable stage]):
Student stage: age <= 22;
Rising stage: 22 < age <= 30;
Stable stage: age > 30.
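The three age tiers above turn a continuous age into a discrete target variable. A minimal sketch of that mapping, using the thresholds stated in the text (the function name and the returned label strings are hypothetical):

```python
def age_stage(age):
    """Map a numeric age to one of the three tiers used as the target variable."""
    if age <= 22:
        return "student"   # student stage: age <= 22
    if age <= 30:
        return "rising"    # rising stage: 22 < age <= 30
    return "stable"        # stable stage: age > 30
```

Applied to users whose real age is known, this produces the class labels on which the decision tree is trained.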
In step S130, the sample data is divided into a training set and a test set, the training set is input into the decision-tree model to train the model parameters, the model parameter training result is applied to the test set, and a model parameter training result that meets a user-defined stability condition is tested for.
This includes:
The sample data is divided into a training set used for modeling and a test set used for verifying the modeling effect. Here, the sample data is the data available for modeling. For example, if there are 1,000,000 samples, they need to be divided into two parts, one for modeling and one for verifying the modeling effect, e.g., training set : test set = 6:4. The 6:4 ratio is adjustable, but in general the training set takes the larger share; commonly used ratios are 5:5, 6:4, 7:3, 75:25, and so on.
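The split described above might be sketched as follows. The patent states only the ratio, so the random shuffle, the fixed seed and the function name are assumptions made for illustration.

```python
import random


def split_samples(samples, train_fraction=0.6, seed=0):
    """Shuffle the samples and split them into (training set, test set).

    train_fraction=0.6 corresponds to the 6:4 ratio given as an example;
    shuffling with a fixed seed is an assumption, not part of the source.
    """
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be between 0 and 1")
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]
```

Changing `train_fraction` to 0.5, 0.7 or 0.75 gives the other commonly used ratios listed above.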
The training set is input into the decision-tree model to train the model parameters; the model parameter training result is applied to the test set, and a model parameter training result is tested for that meets the stability condition that the test-set accuracy is within a user-defined percentage of the training-set accuracy.
The model parameter training result is applied to the test set to check the stability of the training result, that is, whether the overall accuracy on the test set is consistent with the training result. The specific standard of consistency can be weighed according to what is acceptable in the concrete scenario; as a reference, the test-set accuracy should be within ±10% of the training-set accuracy.
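The stability check can be written as a small predicate. The text does not say whether ±10% is relative to the training-set accuracy or an absolute number of percentage points; the sketch below assumes the relative reading, and both function names are hypothetical.

```python
def accuracy(predicted, actual):
    """Fraction of predictions that match the true labels."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


def is_stable(train_accuracy, test_accuracy, tolerance=0.10):
    """True if test accuracy lies within +/- tolerance (relative) of train accuracy.

    The relative interpretation of the 10% reference value is an assumption.
    """
    return abs(test_accuracy - train_accuracy) <= tolerance * train_accuracy
```

A training result that fails this check would be retrained with adjusted parameters rather than output.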
Inputting the training set into the decision-tree model to train the model parameters includes: inputting the feature input variables of the training set into the decision-tree model, the decision-tree model performing variable selection and split-point selection based on the information gain ratio to train the model parameters.
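The information gain ratio used for variable selection can be sketched for a single categorical feature. This follows the usual C4.5-style definition (information gain divided by the split information of the feature); it is an illustration rather than the patent's implementation, the function names are hypothetical, and split-point selection for continuous features is not covered.

```python
import math
from collections import Counter, defaultdict


def entropy(values):
    """Shannon entropy, in bits, of a sequence of discrete values."""
    total = len(values)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(values).values())


def gain_ratio(feature_values, labels):
    """Information gain ratio of one categorical feature with respect to the labels."""
    total = len(labels)
    groups = defaultdict(list)
    for value, label in zip(feature_values, labels):
        groups[value].append(label)
    # Expected entropy of the labels after splitting on the feature.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    # Split information penalizes features with many distinct values.
    split_info = entropy(feature_values)
    return info_gain / split_info if split_info > 0 else 0.0
```

At each node, the feature with the highest gain ratio would be chosen as the splitting variable.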
In step S140, the model parameter training result that meets the user-defined stability condition is output, and the rules of the output model parameter training result are used to periodically update the predicted age of users whose age is unknown.
This includes:
outputting the model parameter training result that meets the stability condition that the test-set accuracy is within a user-defined percentage of the training-set accuracy; organizing the rules of the output model parameter training result into SQL WHERE conditions, and deploying them in the system to periodically update the predicted age of users whose age is unknown.
In one concrete application example, for a user A of unknown age, suppose the following conditions are met: the birth year of the artist the user prefers is later than 1995 and earlier than 1998. User A is then judged to be in the student stage.
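Organizing a decision-tree rule into an SQL WHERE condition might look like the sketch below. A rule is represented as a list of (column, operator, value) conditions along one root-to-leaf path; this representation, the function name and the column name `preferred_artist_birth_year` are hypothetical and do not come from the patent.

```python
ALLOWED_OPERATORS = {"<", "<=", ">", ">=", "=", "<>"}


def rule_to_where(conditions):
    """Join (column, operator, numeric value) conditions into a WHERE clause body."""
    parts = []
    for column, operator, value in conditions:
        if operator not in ALLOWED_OPERATORS:
            raise ValueError(f"unsupported operator: {operator}")
        if not column.isidentifier():
            raise ValueError(f"invalid column name: {column}")
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise ValueError("only numeric thresholds are supported in this sketch")
        parts.append(f"{column} {operator} {value}")
    return " AND ".join(parts)


# The example rule for user A: preferred artist born after 1995 and before 1998.
student_rule = [
    ("preferred_artist_birth_year", ">", 1995),
    ("preferred_artist_birth_year", "<", 1998),
]
where_clause = rule_to_where(student_rule)
```

The resulting string could be placed after `WHERE` in a scheduled update statement that sets the predicted tier to the student stage for matching users.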
In one practical application example, there are many decision-tree algorithms, but when the model is fixed into the system, a relevant algorithm package can be called directly for training; it is only necessary to adjust the relevant parameters so that the actual requirements are met.
Embodiment one of the present invention collects basic data; extracts feature input variables and a target variable from the attributes of the basic data to obtain sample data; divides the sample data into a training set and a test set, inputs the training set into a decision-tree model to train the model parameters, applies the model parameter training result to the test set, and tests for a model parameter training result that meets a user-defined stability condition; outputs the model parameter training result that meets the user-defined stability condition; and uses the rules of the output model parameter training result to periodically update the predicted age of users whose age is unknown. Using the collected basic data, it builds a prediction model, predicts users' ages, builds accurate user profiles, lays a solid data foundation for scenarios such as marketing, and improves the accuracy of age identification.
Embodiment two
As shown in Fig. 2, another aspect of the present invention further provides a system 200 for predicting age characteristics based on a decision-tree model, including:
an acquisition module 21, configured to collect basic data;
an extraction module 22, configured to extract feature input variables and a target variable from the attributes of the basic data to obtain sample data;
a modeling module 23, configured to divide the sample data into a training set and a test set, input the training set into a decision-tree model to train the model parameters, apply the model parameter training result to the test set, and test for a model parameter training result that meets a user-defined stability condition;
an output module 24, configured to output the model parameter training result that meets the user-defined stability condition, and to use the rules of the output model parameter training result to periodically update the predicted age of users whose age is unknown.
Further, as shown in Fig. 3, the acquisition module 21 includes, but is not limited to:
a collecting unit 211, configured to collect registration information, access behavior data, ordering behavior data and/or basic data of artists.
Further, as shown in Fig. 4, the extraction module 22 includes:
an acquiring unit 221, configured to obtain all attribute information in the basic data;
an extracting unit 222, configured to extract from the attribute information at least one input variable related to age prediction and at least one target variable, and to organize the at least one input variable and the at least one target variable into sample data.
Further, as shown in Fig. 5, the modeling module 23 includes:
a training unit 231, configured to divide the sample data into a training set used for modeling and a test set used for verifying the modeling effect, and to input the training set into the decision-tree model to train the model parameters;
a modeling unit 232, configured to apply the model parameter training result to the test set and test for a model parameter training result that meets the stability condition that the test-set accuracy is within a user-defined percentage of the training-set accuracy.
Further, as shown in Fig. 6, the training unit 231 includes:
a training subunit 2311, configured to input the feature input variables of the training set into the decision-tree model, the decision-tree model performing variable selection and split-point selection based on the information gain ratio to train the model parameters.
Further, as shown in Fig. 7, the output module 24 includes:
an output unit 241, configured to output the model parameter training result that meets the stability condition that the test-set accuracy is within a user-defined percentage of the training-set accuracy;
an updating unit 242, configured to organize the rules of the output model parameter training result into SQL WHERE conditions and deploy them in the system to periodically update the predicted age of users whose age is unknown.
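The way the four modules of system 200 hand data to one another can be pictured with a small composition sketch. The class name, field names and the use of plain callables are assumptions made for illustration; the patent describes the modules only functionally.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class AgePredictionSystem:
    """Hypothetical wiring of the four modules described for system 200."""
    acquire: Callable[[], Any]        # acquisition module 21: collect basic data
    extract: Callable[[Any], Any]     # extraction module 22: build sample data
    model: Callable[[Any], Any]       # modeling module 23: train and check stability
    output: Callable[[Any], Any]      # output module 24: publish rules for updates

    def run(self):
        raw = self.acquire()
        samples = self.extract(raw)
        rules = self.model(samples)
        return self.output(rules)
```

Each callable can be replaced independently, which mirrors the statement later in the description that the division into units is only a logical one.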
In embodiment two of the present invention, the acquisition module collects basic data; the extraction module extracts feature input variables and a target variable from the attributes of the basic data to obtain sample data; the modeling module divides the sample data into a training set and a test set, inputs the training set into a decision-tree model to train the model parameters, applies the model parameter training result to the test set, and tests for a model parameter training result that meets a user-defined stability condition; the output module outputs the model parameter training result that meets the user-defined stability condition and uses the rules of the output model parameter training result to periodically update the predicted age of users whose age is unknown. Using the collected basic data, the system builds a prediction model, predicts users' ages, builds accurate user profiles, lays a solid data foundation for scenarios such as marketing, and improves the accuracy of age identification.
Embodiment three
As shown in Fig. 8, another aspect of the present invention further provides a terminal 300 for predicting age characteristics based on a decision-tree model, including the system 200 according to any implementation of embodiment two.
Embodiment three of the present invention collects basic data; extracts feature input variables and a target variable from the attributes of the basic data to obtain sample data; divides the sample data into a training set and a test set, inputs the training set into a decision-tree model to train the model parameters, applies the model parameter training result to the test set, and tests for a model parameter training result that meets a user-defined stability condition; outputs the model parameter training result that meets the user-defined stability condition; and uses the rules of the output model parameter training result to periodically update the predicted age of users whose age is unknown. Using the collected basic data, it builds a prediction model, predicts users' ages, builds accurate user profiles, lays a solid data foundation for scenarios such as marketing, and improves the accuracy of age identification.
The serial numbers of the above embodiments of the present invention are for description only and do not represent the relative merits of the embodiments.
It should be noted that, for brevity, each of the foregoing method embodiments is described as a series of combined actions. However, those skilled in the art should know that the present invention is not limited by the described order of actions, because according to the present invention some steps can be performed in other orders or simultaneously. Those skilled in the art should also know that the embodiments described in the specification are preferred embodiments, and that the actions and modules involved are not necessarily required by the present invention.
In the above embodiments, the description of each embodiment has its own emphasis; for parts not described in detail in one embodiment, refer to the related descriptions of the other embodiments.
In the several embodiments provided in this application, it should be understood that the disclosed device may be implemented in other ways. For example, the device embodiments described above are merely illustrative: the division into units is only a division by logical function, and other divisions are possible in actual implementation. For instance, multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual couplings, direct couplings or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, devices or units, and may be electrical or take other forms.
Units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network elements. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
It should be noted that, depending on implementation needs, each step/component described in this application may be split into more steps/components, and two or more steps/components, or partial operations of steps/components, may be combined into new steps/components to achieve the purpose of the present invention.
The above method according to the present invention may be implemented in hardware or firmware, or implemented as software or computer code that can be stored in a recording medium (such as a CD-ROM, RAM, floppy disk, hard disk or magneto-optical disk), or implemented as computer code that is originally stored in a remote recording medium or a non-transitory machine-readable medium, downloaded over a network and then stored in a local recording medium, so that the method described here can be processed by such software stored on a recording medium using a general-purpose computer, a special-purpose processor, or programmable or dedicated hardware (such as an ASIC or FPGA). It is understood that a computer, processor, microprocessor controller or programmable hardware includes a storage component (for example, RAM, ROM or flash memory) that can store or receive software or computer code; when the software or computer code is accessed and executed by the computer, processor or hardware, the processing method described here is implemented. Furthermore, when a general-purpose computer accesses code for implementing the processing shown here, execution of the code converts the general-purpose computer into a special-purpose computer for performing that processing.
The above are only specific embodiments of the present invention, but the scope of protection of the present invention is not limited thereto. Any change or replacement that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention shall fall within the scope of protection of the present invention. Therefore, the scope of protection of the present invention shall be subject to the scope of protection of the claims.
Claims (13)
1. A method for predicting age characteristics based on a decision-tree model, characterized by comprising:
collecting basic data information;
extracting the feature input variables and the target variable from the attributes of the basic data information to obtain sample data;
dividing the sample data into a training set and a test set, inputting the training set into a decision-tree model for model parameter training, applying the model parameter training result to the test set, and testing for a model parameter training result that meets a user-defined stability condition;
outputting the model parameter training result that meets the user-defined stability condition, and using the rules of the output model parameter training result to periodically update the age prediction results for users of unknown age.
2. The method of claim 1, characterized in that collecting basic data information includes, but is not limited to: collecting registration information, access behavior data, order-placing behavior data and/or basic data of artists.
3. The method of claim 1 or 2, characterized in that extracting the feature input variables and the target variable from the attributes of the basic data information to obtain sample data comprises:
obtaining all attribute information in the basic data information;
extracting from the attribute information at least one input variable and at least one target variable related to age prediction, and arranging the at least one input variable and the at least one target variable to obtain the sample data.
4. The method of any one of claims 1-3, characterized in that dividing the sample data into a training set and a test set, inputting the training set into the decision-tree model for model parameter training, applying the model parameter training result to the test set, and testing for a model parameter training result that meets the user-defined stability condition comprises:
dividing the sample data into a training set used for modeling and a test set used for verifying the model's effect;
inputting the training set into the decision-tree model for model parameter training;
applying the model parameter training result to the test set, and testing for a model parameter training result that meets the stability condition that the test-set accuracy is within a user-defined percentage of the training-set accuracy.
5. The method of claim 4, characterized in that inputting the training set into the decision-tree model for model parameter training comprises:
inputting the feature input variables of the training set into the decision-tree model, the decision-tree model performing variable selection and split-point selection based on the information gain ratio to carry out model parameter training.
6. The method of any one of claims 1-4, characterized in that outputting the model parameter training result that meets the user-defined stability condition and using the rules of the output model parameter training result to periodically update the age prediction results for users of unknown age comprises:
outputting the model parameter training result that meets the stability condition that the test-set accuracy is within a user-defined percentage of the training-set accuracy; organizing the rules of the output model parameter training result into SQL WHERE conditions and deploying them in the system to periodically update the age prediction results for users of unknown age.
7. A system for predicting age characteristics based on a decision-tree model, characterized by comprising:
a collection module, configured to collect basic data information;
an extraction module, configured to extract the feature input variables and the target variable from the attributes of the basic data information to obtain sample data;
a modeling module, configured to divide the sample data into a training set and a test set, input the training set into a decision-tree model for model parameter training, apply the model parameter training result to the test set, and test for a model parameter training result that meets a user-defined stability condition;
an output module, configured to output the model parameter training result that meets the user-defined stability condition, and to use the rules of the output model parameter training result to periodically update the age prediction results for users of unknown age.
8. The system of claim 7, characterized in that the collection module includes, but is not limited to:
a collection unit, configured to collect registration information, access behavior data, order-placing behavior data and/or basic data of artists.
9. The system of claim 7 or 8, characterized in that the extraction module comprises:
an obtaining unit, configured to obtain all attribute information in the basic data information;
an extracting unit, configured to extract from the attribute information at least one input variable and at least one target variable related to age prediction, and to arrange the at least one input variable and the at least one target variable to obtain the sample data.
10. The system of claim 7, characterized in that the modeling module comprises:
a training unit, configured to divide the sample data into a training set used for modeling and a test set used for verifying the model's effect, and to input the training set into the decision-tree model for model parameter training;
a modeling unit, configured to apply the model parameter training result to the test set, and to test for a model parameter training result that meets the stability condition that the test-set accuracy is within a user-defined percentage of the training-set accuracy.
11. The system of claim 10, characterized in that the training unit comprises:
a training subunit, configured to input the feature input variables of the training set into the decision-tree model, the decision-tree model performing variable selection and split-point selection based on the information gain ratio to carry out model parameter training.
12. The system of claim 7, characterized in that the output module comprises:
an output unit, configured to output the model parameter training result that meets the stability condition that the test-set accuracy is within a user-defined percentage of the training-set accuracy;
an updating unit, configured to organize the rules of the output model parameter training result into SQL WHERE conditions and deploy them in the system to periodically update the age prediction results for users of unknown age.
13. A terminal for predicting age characteristics based on a decision-tree model, comprising the system of any one of claims 7-12.
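Two of the steps named in the claims lend themselves to a short illustration: split-point selection by information gain ratio (claims 5 and 11) and turning a tree rule into an SQL WHERE condition (claims 6 and 12). The Python sketch below is a hedged example under stated assumptions, not the claimed implementation: it handles only a binary split on one numeric variable, takes candidate thresholds as midpoints between adjacent distinct values, and uses hypothetical column names when building the WHERE text.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def gain_ratio(values, labels, threshold):
    """Information gain ratio of the binary split `value <= threshold`."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    if not left or not right:
        return 0.0  # a split that leaves one side empty carries no information
    n = len(labels)
    weights = (len(left) / n, len(right) / n)
    gain = entropy(labels) - weights[0] * entropy(left) - weights[1] * entropy(right)
    split_info = -sum(w * math.log2(w) for w in weights)
    return gain / split_info


def best_split(values, labels):
    """Return the candidate threshold with the highest gain ratio, or None."""
    distinct = sorted(set(values))
    candidates = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    return max(candidates, key=lambda t: gain_ratio(values, labels, t), default=None)


def path_to_where(conditions):
    """Join the (column, operator, number) tests along one root-to-leaf path
    into the text of an SQL WHERE condition. Values are assumed to be numeric
    literals produced by the model, not user input."""
    return " AND ".join(f"{col} {op} {val}" for col, op, val in conditions)
```

For example, `path_to_where([("order_count", ">", 3), ("days_since_signup", "<=", 90)])` yields `order_count > 3 AND days_since_signup <= 90`, which a scheduled job could place after `WHERE` in an update statement; both column names are invented for the example.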
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610989789.7A CN106503863A (en) | 2016-11-10 | 2016-11-10 | Method, system and terminal for predicting age characteristics based on a decision-tree model |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610989789.7A CN106503863A (en) | 2016-11-10 | 2016-11-10 | Method, system and terminal for predicting age characteristics based on a decision-tree model |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106503863A true CN106503863A (en) | 2017-03-15 |
Family
ID=58323996
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610989789.7A Pending CN106503863A (en) | 2016-11-10 | 2016-11-10 | Method, system and terminal for predicting age characteristics based on a decision-tree model |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106503863A (en) |
Cited By (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106951692A (en) * | 2017-03-06 | 2017-07-14 | 北京师范大学 | Exemplary resource elements recognition method and device |
CN107273883A (en) * | 2017-05-03 | 2017-10-20 | 天方创新(北京)信息技术有限公司 | Decision-tree model training method, determine data attribute method and device in OCR result |
CN107316204A (en) * | 2017-05-27 | 2017-11-03 | 银联智惠信息服务(上海)有限公司 | Recognize humanized method, device, computer-readable medium and the system of holding |
CN108549954A (en) * | 2018-03-26 | 2018-09-18 | 平安科技(深圳)有限公司 | Risk model training method, risk identification method, device, equipment and medium |
CN109376932A (en) * | 2018-10-30 | 2019-02-22 | 平安医疗健康管理股份有限公司 | Age prediction method, device, server and storage medium based on prediction model |
CN109508558A (en) * | 2018-10-31 | 2019-03-22 | 阿里巴巴集团控股有限公司 | A kind of verification method and device of data validity |
CN109657482A (en) * | 2018-10-26 | 2019-04-19 | 阿里巴巴集团控股有限公司 | A kind of verification method and device of data validity |
CN110415020A (en) * | 2019-07-01 | 2019-11-05 | 北京三快在线科技有限公司 | Age prediction method, device and electronic equipment |
WO2019218751A1 (en) * | 2018-05-16 | 2019-11-21 | 阿里巴巴集团控股有限公司 | Processing method, apparatus and device for risk prediction of insurance service |
CN111279304A (en) * | 2017-09-29 | 2020-06-12 | 甲骨文国际公司 | Method and system for configuring a communication decision tree based on connected locatable elements on a canvas |
CN111325372A (en) * | 2018-12-13 | 2020-06-23 | 北京京东尚科信息技术有限公司 | Prediction model establishment method, prediction method, device, medium and equipment |
CN111340276A (en) * | 2020-02-19 | 2020-06-26 | 联想(北京)有限公司 | Method and system for generating prediction data |
CN112712900A (en) * | 2021-01-08 | 2021-04-27 | 昆山杜克大学 | Physiological age prediction model based on machine learning and establishment method thereof |
CN113838578A (en) * | 2021-09-27 | 2021-12-24 | 南方医科大学珠江医院 | Big data-based trachea foreign matter emergency treatment device with striking hammer |
CN113947417A (en) * | 2020-07-15 | 2022-01-18 | 上海哔哩哔哩科技有限公司 | Training method and device of age identification model and age identification method and device |
US11775843B2 (en) | 2017-09-29 | 2023-10-03 | Oracle International Corporation | Directed trajectories through communication decision tree using iterative artificial intelligence |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20100312724A1 (en) * | 2007-11-02 | 2010-12-09 | Thomas Pinckney | Inferring user preferences from an internet based social interactive construct |
WO2011019731A2 (en) * | 2009-08-10 | 2011-02-17 | Mintigo Ltd. | Systems and methods for gererating leads in a network by predicting properties of external nodes |
CN102859967A (en) * | 2010-03-01 | 2013-01-02 | 诺基亚公司 | Method and apparatus for estimating user characteristics based on user interaction data |
CN103886074A (en) * | 2014-03-24 | 2014-06-25 | 江苏名通信息科技有限公司 | Commodity recommendation system based on social media |
CN103927675A (en) * | 2014-04-18 | 2014-07-16 | 北京京东尚科信息技术有限公司 | Method and device for judging age brackets of users |
CN104598607A (en) * | 2015-01-29 | 2015-05-06 | 百度在线网络技术(北京)有限公司 | Method and system for recommending search phrase |
CN106022800A (en) * | 2016-05-16 | 2016-10-12 | 北京百分点信息科技有限公司 | User feature data processing method and device |
Cited By (24)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106951692A (en) * | 2017-03-06 | 2017-07-14 | 北京师范大学 | Exemplary resource elements recognition method and device |
CN107273883B (en) * | 2017-05-03 | 2020-04-21 | 天方创新(北京)信息技术有限公司 | Decision tree model training method, and method and device for determining data attributes in OCR (optical character recognition) result |
CN107273883A (en) * | 2017-05-03 | 2017-10-20 | 天方创新(北京)信息技术有限公司 | Decision-tree model training method, determine data attribute method and device in OCR result |
CN107316204A (en) * | 2017-05-27 | 2017-11-03 | 银联智惠信息服务(上海)有限公司 | Recognize humanized method, device, computer-readable medium and the system of holding |
CN111279304B (en) * | 2017-09-29 | 2023-08-15 | 甲骨文国际公司 | Method and system for configuring communication decision tree based on locatable elements connected on canvas |
US11900267B2 (en) | 2017-09-29 | 2024-02-13 | Oracle International Corporation | Methods and systems for configuring communication decision trees based on connected positionable elements on canvas |
CN111279304A (en) * | 2017-09-29 | 2020-06-12 | 甲骨文国际公司 | Method and system for configuring a communication decision tree based on connected locatable elements on a canvas |
US11775843B2 (en) | 2017-09-29 | 2023-10-03 | Oracle International Corporation | Directed trajectories through communication decision tree using iterative artificial intelligence |
CN108549954A (en) * | 2018-03-26 | 2018-09-18 | 平安科技(深圳)有限公司 | Risk model training method, risk identification method, device, equipment and medium |
CN108549954B (en) * | 2018-03-26 | 2022-08-02 | 平安科技(深圳)有限公司 | Risk model training method, risk identification device, risk identification equipment and risk identification medium |
WO2019218751A1 (en) * | 2018-05-16 | 2019-11-21 | 阿里巴巴集团控股有限公司 | Processing method, apparatus and device for risk prediction of insurance service |
CN109657482A (en) * | 2018-10-26 | 2019-04-19 | 阿里巴巴集团控股有限公司 | A kind of verification method and device of data validity |
CN109657482B (en) * | 2018-10-26 | 2022-11-18 | 创新先进技术有限公司 | Data validity verification method, device and equipment |
CN109376932A (en) * | 2018-10-30 | 2019-02-22 | 平安医疗健康管理股份有限公司 | Age prediction method, device, server and storage medium based on prediction model |
CN109508558B (en) * | 2018-10-31 | 2022-11-18 | 创新先进技术有限公司 | Data validity verification method, device and equipment |
CN109508558A (en) * | 2018-10-31 | 2019-03-22 | 阿里巴巴集团控股有限公司 | A kind of verification method and device of data validity |
CN111325372A (en) * | 2018-12-13 | 2020-06-23 | 北京京东尚科信息技术有限公司 | Prediction model establishment method, prediction method, device, medium and equipment |
CN111325372B (en) * | 2018-12-13 | 2025-04-18 | 北京京东尚科信息技术有限公司 | Method for establishing prediction model, prediction method, device, medium and equipment |
CN110415020A (en) * | 2019-07-01 | 2019-11-05 | 北京三快在线科技有限公司 | Age prediction method, device and electronic equipment |
CN111340276B (en) * | 2020-02-19 | 2022-08-19 | 联想(北京)有限公司 | Method and system for generating prediction data |
CN111340276A (en) * | 2020-02-19 | 2020-06-26 | 联想(北京)有限公司 | Method and system for generating prediction data |
CN113947417A (en) * | 2020-07-15 | 2022-01-18 | 上海哔哩哔哩科技有限公司 | Training method and device of age identification model and age identification method and device |
CN112712900A (en) * | 2021-01-08 | 2021-04-27 | 昆山杜克大学 | Physiological age prediction model based on machine learning and establishment method thereof |
CN113838578A (en) * | 2021-09-27 | 2021-12-24 | 南方医科大学珠江医院 | Big data-based trachea foreign matter emergency treatment device with striking hammer |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106503863A (en) | Method, system and terminal for predicting age characteristics based on a decision-tree model | |
CN111181939B (en) | A network intrusion detection method and device based on ensemble learning | |
CN116228021A (en) | Mine ecological restoration evaluation analysis method and system based on environment monitoring | |
CN109063984B (en) | Method, apparatus, computer device and storage medium for risky travelers | |
CN109583468A (en) | Training sample acquisition methods, sample predictions method and corresponding intrument | |
CN108833139B (en) | An OSSEC Alarm Data Aggregation Method Based on Category Attribute Division | |
CN107169768A (en) | The acquisition methods and device of abnormal transaction data | |
CN109919781A (en) | Case recognition methods, electronic device and computer readable storage medium are cheated by clique | |
CN110008259A (en) | The method and terminal device of visualized data analysis | |
CN106843941B (en) | Information processing method and device and computer equipment | |
CN112437053B (en) | Intrusion detection method and device | |
CN112329816A (en) | Data classification method and device, electronic equipment and readable storage medium | |
CN112818162B (en) | Image retrieval method, device, storage medium and electronic equipment | |
CN105574544A (en) | Data processing method and device | |
CN113852204A (en) | Three-dimensional panoramic monitoring system and method for transformer substation with digital twin | |
CN109886554A (en) | Unlawful practice method of discrimination, device, computer equipment and storage medium | |
CN117156442A (en) | Cloud data security protection method and system based on 5G network | |
CN112801231A (en) | Decision model training method and device for business object classification | |
CN114187036B (en) | Internet advertisement intelligent recommendation management system based on behavior characteristic recognition | |
CN113726558A (en) | Network equipment flow prediction system based on random forest algorithm | |
CN109995611B (en) | Traffic classification model establishing and traffic classification method, device, equipment and server | |
Ayhan et al. | Analysis of image classification methods for remote sensing | |
CN111210158B (en) | Target address determining method, device, computer equipment and storage medium | |
CN115392582B (en) | Crop yield prediction method based on incremental fuzzy rough set attribute reduction | |
CN112580780A (en) | Model training processing method, device, equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||