Disclosure of Invention
In view of this, the embodiments of the application provide a method, a system and a device for converting a domain model language and a general decision table, which can effectively solve the problem of low conversion efficiency when a SAS Enterprise Miner script file is converted into a decision table.
In a first aspect, an embodiment of the present application provides a method for converting a domain model language and a general decision table, including:
based on a grammar rule parsing file, converting a decision tree model file into grammar tree structure data by using a grammar parser, wherein the decision tree model file is a script file exported from a model data mining tool;
traversing the grammar tree structure data and converting the grammar tree structure data into intermediate structure data according to a preset data conversion rule;
and generating a general decision table conforming to a target decision engine according to the intermediate structure data, wherein the general decision table is used for instructing the target decision engine to generate a decision tree.
In some embodiments, the intermediate structure data is java array type data, wherein the java array comprises a plurality of multi-level key value pair objects and sub-arrays which are mixed and nested, wherein each level of sub-array is used as a key value of an element in a corresponding level key value pair object, and each level of sub-array comprises at least one array element formed by a next level key value pair object;
The traversing the grammar tree structure data and converting the grammar tree structure data into intermediate structure data according to a preset data conversion rule comprises the following steps:
traversing from the root node of the grammar tree structure data toward the leaf nodes, and generating, according to the hierarchical relation of the sub-nodes, array elements in which multi-level key value pair objects and sub-arrays are mixed and nested, wherein the node information of each level is used as the key of an element in a key value pair object, a sub-array is used as the key value of that key value pair object, and the nodes of the next level each form, as key value pair objects, the elements of the sub-array of the corresponding level; specifically, an end key value pair object is generated for a sub-node connected to leaf nodes, the key value of the end key value pair object is an end array, and the information of each leaf node of that sub-node is stored as an element in the end array; and if one node is connected to a plurality of sub-nodes of the same level, the sub-array of the corresponding level comprises a plurality of corresponding key value pair objects;
and generating the intermediate structure data according to the array elements in which the multi-level key value pair objects and the sub-arrays are mixed and nested.
In some embodiments, the intermediate structure data comprises an input statement key value pair object and an output statement key value pair object, wherein the key value of an element in the input statement key value pair object is an input element array, the input element array comprises a rule key value pair object, and the terminal array in the output statement key value pair object comprises a node identifier key value pair object;
The generating a general decision table conforming to a target decision engine according to the intermediate structure data comprises the following steps:
traversing the elements at all levels of the java array to obtain the input statement key value pair object and the output statement key value pair object;
determining, according to the information in the rule key value pair object, to store the input statement information in the corresponding input statement key value pair object into an input statement set;
determining, according to the information in the node identifier key value pair object, to store the output statement information in the corresponding output statement key value pair object into an output statement set;
converting the information in each rule key value pair object in the input statement set into an information format conforming to the target decision engine to obtain a rule information list set;
and generating the general decision table according to the input statement set, the output statement set and the rule information list set.
In some embodiments, the generating the general decision table according to the input statement set, the output statement set and the rule information list set comprises:
generating a first decision table array according to each piece of associated input statement information in the input statement set and the corresponding output statement information in the output statement set;
generating a second decision table array according to the rule information list set;
taking the first decision table array and the second decision table array respectively as key values to obtain a first key value pair object and a second key value pair object, and obtaining decision table elements according to the first key value pair object and the second key value pair object;
and obtaining the general decision table according to each decision table element.
In some embodiments, the method further comprises:
Generating a custom variable applicable to the target decision engine according to the intermediate structure data, wherein the custom variable is used for indicating the target decision engine to generate the decision tree, and specifically comprises the following steps:
Traversing the intermediate structure data to obtain each preposed assignment statement node;
Acquiring a target variable set defined in the target decision engine;
Screening variable assignment statement nodes in the preposed assignment statement nodes, and acquiring corresponding target variables in the target variable set; storing the variables in the variable assignment statement and the target variables as key value pairs into preset map data;
traversing the intermediate structure data to respectively acquire if condition statements, elseif condition statements and else condition statements, and correspondingly acquiring a first-level condition and a second-level condition;
recursively obtaining, step by step, the conditions of the sub-nodes respectively corresponding to the if condition statement, the elseif condition statement and the else condition statement, to correspondingly obtain each first-level sub-condition, each second-level sub-condition and each third-level sub-condition;
generating a first-level combination condition according to the first-level condition and each first-level sub-condition, generating a second-level combination condition according to the second-level condition and each second-level sub-condition, and generating a third-level combination condition according to each third-level sub-condition.
In some embodiments, the recursively obtaining, step by step, the conditions of the sub-nodes respectively corresponding to the if condition statement, the elseif condition statement and the else condition statement, to correspondingly obtain each first-level sub-condition, each second-level sub-condition and each third-level sub-condition comprises:
recursively obtaining, step by step, the conditions of the sub-nodes respectively corresponding to the if condition statement, the elseif condition statement and the else condition statement, and pushing each first-level sub-condition, each second-level sub-condition and each third-level sub-condition onto a stack; and traversing until a decision object node is not empty, which indicates that the node is a leaf node, and then popping each first-level sub-condition, each second-level sub-condition and each third-level sub-condition from the stack one by one to generate the combination conditions of each level.
In some embodiments, the model data mining tool is SAS Enterprise Miner, the decision tree model file is a .sas decision tree model file, the grammar parser is ANTLR, and the grammar rule parsing file is a .g4 grammar rule parsing file;
the converting, based on a grammar rule parsing file, a decision tree model file into grammar tree structure data by using a grammar parser comprises:
obtaining a .sas decision tree model file exported from SAS Enterprise Miner;
acquiring a .g4 grammar rule parsing file configured according to the grammar rules of the .sas decision tree model file;
and converting, based on the .g4 grammar rule parsing file, the .sas decision tree model file into grammar tree structure data by using ANTLR.
In a second aspect, an embodiment of the present application provides a system for converting a domain model language and a general decision table, including a model data mining tool, a grammar parser, a target decision engine, and a controller;
The controller is used for generating a general decision table according to the conversion method of the domain model language and the general decision table provided by the first aspect of the application;
the model data mining tool is used for constructing a decision tree model file;
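the grammar parser is used for converting the decision tree model file into grammar tree structure data based on a grammar rule parsing file;
the target decision engine is used for executing the general decision table to generate a decision tree.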
In some embodiments, the intermediate structure data is java array type data, wherein the java array comprises a plurality of multi-level key value pair objects and sub-arrays which are mixed and nested, wherein each level of sub-array is used as a key value of an element in a corresponding level key value pair object, and each level of sub-array comprises at least one array element formed by a next level key value pair object;
The controller is specifically configured to:
traverse from the root node of the grammar tree structure data toward the leaf nodes, and generate, according to the hierarchical relation of the sub-nodes, array elements in which multi-level key value pair objects and sub-arrays are mixed and nested, wherein the node information of each level is used as the key of an element in a key value pair object, a sub-array is used as the key value of that key value pair object, and the nodes of the next level each form, as key value pair objects, the elements of the sub-array of the corresponding level; specifically, an end key value pair object is generated for a sub-node connected to leaf nodes, the key value of the end key value pair object is an end array, and the information of each leaf node of that sub-node is stored as an element in the end array; and if one node is connected to a plurality of sub-nodes of the same level, the sub-array of the corresponding level comprises a plurality of corresponding key value pair objects;
and generate the intermediate structure data according to the array elements in which the multi-level key value pair objects and the sub-arrays are mixed and nested.
In a third aspect, an embodiment of the present application provides a terminal device, where the terminal device includes a processor and a memory, where the memory stores a computer program, and the processor is configured to execute the computer program to implement a method for converting a domain model language and a generic decision table provided in the first aspect of the present application.
The embodiment of the application has the following beneficial effects:
The method comprises the steps of converting a decision tree model file into syntax tree structure data by using a syntax parser and a syntax rule parsing file, wherein the decision tree model file is a script file derived from a model data mining tool, the syntax rule parsing file is set according to syntax rules of the decision tree model file, traversing the syntax tree structure data and converting the syntax tree structure data into intermediate structure data according to preset data conversion rules, and generating a general decision table conforming to a target decision engine according to the intermediate structure data, wherein the general decision table is used for indicating the target decision engine to generate a decision tree. The application can automatically generate the general decision table without manually creating a decision tree in a decision engine system, has simple creation process, is not easy to make mistakes and has high conversion efficiency.
Detailed Description
The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the accompanying drawings. Apparently, the described embodiments are only some, not all, of the embodiments of the present application.
The components of the embodiments of the present application generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the application, as presented in the figures, is not intended to limit the scope of the application, as claimed, but is merely representative of selected embodiments of the application. All other embodiments, which can be made by a person skilled in the art without making any inventive effort, are intended to be within the scope of the present application.
The terms "comprises," "comprising," "including," or any other variation thereof, as used in the various embodiments of the present application, are intended only to denote a specific feature, number, step, operation, element, component, or combination of the foregoing, and should not be understood as first excluding the presence of, or the possibility of adding, one or more other features, numbers, steps, operations, elements, components, or combinations of the foregoing. Furthermore, the terms "first," "second," "third," and the like are used merely to distinguish between descriptions and should not be construed as indicating or implying relative importance.
Unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which various embodiments of the application belong. The terms (such as those defined in commonly used dictionaries) will be interpreted as having a meaning that is the same as the context of the relevant art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein in connection with the various embodiments of the application.
Some embodiments of the present application are described in detail below with reference to the accompanying drawings. The embodiments described below and features of the embodiments may be combined with each other without conflict.
After business personnel visually create a decision table with third-party modeling software, the decision table must be manually converted into the decision table format of the target decision engine and then imported into the target decision engine. This conversion process is cumbersome and time-consuming, so a tool is needed that automatically converts the decision table script of the modeling software into a general decision table in a recognizable format. Therefore, the application provides a method, a system and a device for converting a domain model language and a general decision table, so as to improve conversion efficiency.
The present application provides a system for converting a domain model language and a general decision table which, as shown in FIG. 1, exemplarily comprises a controller 100, a model data mining tool 200, a grammar parser 300 and a target decision engine 400.
The model data mining tool 200 is used for constructing a decision tree model file.
The grammar parser 300 is configured to convert the decision tree model file into syntax tree structure data based on a grammar rule parsing file.
The controller 100 is configured to generate a generic decision table according to a method for converting a domain model language and a generic decision table provided by the present application.
The target decision engine 400 is configured to generate a decision tree according to the generic decision table.
The method of converting the domain model language to a generic decision table is described below in connection with specific embodiments.
FIG. 2 is a flow chart of a method for converting a domain model language to a generic decision table according to an embodiment of the application. The method for converting the domain model language and the general decision table comprises the following steps:
S100, converting a decision tree model file into syntax tree structure data by using a grammar parser 300 based on a grammar rule parsing file, wherein the decision tree model file is a script file exported from a model data mining tool 200, and the grammar rule parsing file is configured according to the grammar rules of the decision tree model file. The grammar rule parsing file describes the structure and grammar of the decision tree model file.
Further, exemplarily, the model data mining tool 200 is SAS Enterprise Miner, the decision tree model file is a .sas decision tree model file, the grammar parser 300 is ANTLR, and the grammar rule parsing file is a .g4 grammar rule parsing file (abbreviated as g4 file).
The g4 grammar rule parsing file comprises a grammar rule definition part, a statement definition part, a syntax analysis part and a lexical analysis part. The lexical analysis part defines what is a numerical value, what is a quotation mark, what is a keyword (e.g., IF, DO, NOT, TRUE), and so on. The syntax analysis part defines what is an integer, what is a decimal, what is a string, what is a constant, and so on. The statement definition part defines what is a method function statement, what is an assignment statement, and so on. The grammar rule definition part defines the grammar rules of the .sas decision tree model file (abbreviated as sas script file). The keywords, grammar rules, statements, etc. defined in the g4 file are used to identify the content of the sas script file. For each definition line in the g4 file, ANTLR (e.g., ANTLR 4) automatically generates a callback method, and the application can override the callback method to implement its own business logic.
In step S100, the converting a decision tree model file into syntax tree structure data by using the grammar parser 300 based on the grammar rule parsing file comprises:
S110, acquiring a .sas decision tree model file exported from SAS Enterprise Miner.
SAS Enterprise Miner is a powerful data analysis and model data mining tool 200 that provides a complete data science workflow covering data preprocessing, modeling, evaluation, deployment, etc. When the user completes model building in SAS Enterprise Miner, the model may be exported as a SAS dataset (.sas file, i.e., a file with the .sas suffix). The structure of the exported SAS file typically includes the following parts:
(1) A data set definition section, which includes basic information of the data set, such as the name of the data set, the data types, and the number and names of the variables. Here, 'project' represents the name of a database, and 'DATASETNAME' represents the name of a specific data set.
(2) A variable definition section, which details each variable in the data set. Each variable definition typically starts with the variable name, followed by the data type, the length and possibly a default value, e.g., VariableName TYPE=Numeric LENGTH=8 DEFAULT=0. Here, VariableName is the variable name, TYPE specifies the variable type (e.g., 'Numeric', 'Char', etc.), LENGTH specifies the length of a character-type variable, and DEFAULT specifies the default value used for missing values.
(3) Record data, which is the main part of the data set and contains all observation data. Each row represents an observation and each column represents a variable. For example:
```sas
123.45 "Sample Text" 2023Q1 90
56.78 "Another Text" 2023Q2 75
```
Each field is typically separated by a space, a tab, or a comma.
(4) An end marker, which is typically the last line of the data set and indicates the end of the data set: "RUN;".
S120, acquiring a .g4 grammar rule parsing file configured according to the grammar rules of the .sas decision tree model file.
S130, converting, based on the .g4 grammar rule parsing file, the .sas decision tree model file into syntax tree structure data by using ANTLR. The syntax tree structure data is an abstract syntax tree (AST).
Specifically, a g4 grammar rule parsing file is written according to the grammar rules of the .sas decision tree model file (sas script file). The g4 grammar rule parsing file is then processed with an IDEA plug-in (installing the ANTLR plug-in in IDEA makes it more convenient to configure the grammar file, generate code from it and test it, which speeds up development), and a number of java classes are automatically generated for the g4 grammar rule parsing file, for example the key interface SasGrammarVisitor.class and the key class SasCustomVisitor.class. The generated java classes contain a traversal callback method for each node defined in the g4 grammar rule parsing file. When a sas script file is actually parsed, the file is traversed, and whenever a node of a given type is hit, the callback method corresponding to that node is called. The actual content of the script can then be obtained in the callback method, and the parsed data is filled into the corresponding fields of the intermediate structure data defined in java.
S200, traversing the grammar tree structure data and converting the grammar tree structure data into intermediate structure data according to a preset data conversion rule.
The intermediate structure data is data of a java array type, wherein the java array comprises a plurality of multi-level key value pair objects and sub-arrays which are mixed and nested, each level of sub-array is used as a key value of an element in a corresponding level key value pair object, and each level of sub-array comprises at least one array element formed by a next level key value pair object.
For example, an excerpt of the java array is: { "item": { "decisionItems": [ { "identifier": "_NODE_", "operator": "=", "originText": "_NODE_ = 33", "type": "assignment", "value": "33" }, { "ifComb": { "condition": { "operator": "AND", "rules": [ { "arguments": [ "'325FD0C2D2BBA1A11A'x" ], "identifier": "_ARBFMT_12", "operator": "IN", "originText": "_ARBFMT_12 IN ('325FD0C2D2BBA1A11A'x)" } ...
S300, generating a general decision table conforming to the target decision engine 400 according to the intermediate structure data, wherein the general decision table is used for indicating the target decision engine 400 to generate a decision tree.
In step S200, the traversing the syntax tree structure data and converting the syntax tree structure data into intermediate structure data according to a preset data conversion rule comprises:
traversing from the root node of the syntax tree structure data toward the leaf nodes, and generating, according to the hierarchical relation of the sub-nodes, array elements in which multi-level key value pair objects and sub-arrays are mixed and nested, wherein the node information of each level is used as the key of an element in a key value pair object, a sub-array is used as the key value of that key value pair object, and the nodes of the next level each form, as key value pair objects, the elements of the sub-array of the corresponding level; specifically, an end key value pair object is generated for a sub-node connected to leaf nodes, the key value of the end key value pair object is an end array, and the information of each leaf node of that sub-node is stored as an element in the end array; and if one node is connected to a plurality of sub-nodes of the same level, the sub-array of the corresponding level comprises a plurality of corresponding key value pair objects;
and generating the intermediate structure data according to the array elements in which the multi-level key value pair objects and the sub-arrays are mixed and nested.
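The traversal described above can be illustrated with a minimal Java sketch. It assumes a simplified tree rather than a real ANTLR parse tree: `TreeNode` and `IntermediateBuilder` are hypothetical names introduced only for this illustration, and leaf information is stored as plain strings.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a syntax tree node; this is not an ANTLR class.
final class TreeNode {
    final String name;                              // node information, used as the key
    final String text;                              // leaf content; null for inner nodes
    final List<TreeNode> children = new ArrayList<>();

    TreeNode(String name, String text) {
        this.name = name;
        this.text = text;
    }

    TreeNode add(TreeNode child) {
        children.add(child);
        return this;
    }
}

final class IntermediateBuilder {
    // Converts one node into a key value pair object whose key value is a sub-array.
    static Map<String, Object> toKeyValue(TreeNode node) {
        List<Object> subArray = new ArrayList<>();
        for (TreeNode child : node.children) {
            if (child.children.isEmpty()) {
                // Leaf node: its information becomes an element of the end array.
                subArray.add(child.text);
            } else {
                // Inner node: one nested key value pair object per sub-node.
                subArray.add(toKeyValue(child));
            }
        }
        Map<String, Object> pair = new LinkedHashMap<>();
        pair.put(node.name, subArray);
        return pair;
    }
}
```

A node with several sub-nodes of the same level simply yields several key value pair objects in the same sub-array, which mirrors the last clause of the step above.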
Taking elseIfComb as an example, when IFSTATEMENT is traversed in the callback for the abstract syntax tree (AST), it is judged whether ELSEIFITEMS exists in IFSTATEMENT; if so, ELSEIFITEMS is traversed and the information corresponding to ELSEIFITEMS is placed into elseIfComb of the intermediate structure data. In the callback function, each attribute is assigned to the corresponding key value pair of elseIfComb. If the decision object decisionItem is not null, the node is already a leaf node.
Further, the intermediate structure data comprises an input statement key value pair object and an output statement key value pair object, wherein the key value of an element in the input statement key value pair object is an input element array, and the input element array comprises a rule key value pair object. The end array in the output statement key value pair object comprises a node identifier key value pair object.
In step S300, the generating a general decision table conforming to the target decision engine 400 according to the intermediate structure data comprises:
S310, traversing all levels of elements in the java array, and obtaining the input statement key value pair object and the output statement key value pair object.
S320, according to the information in the rule key value pair object, determining to store the input statement information in the corresponding input statement key value pair object into an input statement set. For example, the rule key value pair objects are traversed one by one, the class of rule that each satisfies (such as a comparison rule, an IN condition rule or a function rule) is determined from the rules within the object, and the result is placed into the input statement set InputVars. For instance, if the element "operator" in the key value corresponding to the "rules" key of the intermediate structure data is "IN", the rule is an IN condition rule. Conditions of different rule types are written differently when the decision table is constructed, and are therefore also written differently in the decision engine.
S330, according to the information in the node identifier key value pair object, determining to store the output statement information in the corresponding output statement key value pair object into an output statement set. For example, a statement whose key is "identifier" and whose key value is the keyword "_NODE_" is extracted and placed into the output statement set outputVars.
S340, converting the information in each rule key value pair object in the input sentence set into an information format conforming to the target decision engine 400, and obtaining a rule information list set.
For example, each rule is converted into a specific data row; the comparison symbols in the code are mathematical symbols such as >, <, ≥ and ≤, which are converted into the format used by the target decision engine 400.
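As an illustration of step S340, the following Java sketch turns a comparison rule into the text of a decision table cell. Only the phrase "Less than" appears in the example table of this description; the other phrases, the class name `OperatorFormatter` and the method `toCell` are assumptions made for the sketch, and a real target decision engine may expect a different format.

```java
import java.util.Map;

final class OperatorFormatter {
    // Assumed display phrases; only "Less than" is taken from this description.
    private static final Map<String, String> PHRASES = Map.of(
            "<", "Less than",
            "<=", "Less than or equal to",
            ">", "Greater than",
            ">=", "Greater than or equal to");

    // Builds the cell text for one comparison rule, e.g. ("<", "64.5").
    static String toCell(String operator, String value) {
        String phrase = PHRASES.get(operator);
        if (phrase == null) {
            throw new IllegalArgumentException("Unsupported operator: " + operator);
        }
        return phrase + " " + value;
    }
}
```

Rejecting unknown operators explicitly keeps an unrecognized rule from silently producing a malformed cell.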
S350, generating the general decision table according to the input statement set, the output statement set and the rule information list set.
Finally, the result is converted into the data structure of the decision table in the decision engine system; for example, a two-dimensional array is adopted, namely List<List<String>> rows = new ArrayList<>().
Illustratively, the final general decision table for the target decision engine 400 is:
{"headers":["input:null","input:earliestRecordCredi","output:_NODE_"],"rows":[["","","33"],["IN\"'325FD0C2D2BBA1A11A'x\"","","22"],["IN\"'325FD0C2D2BBA1A11A'x\"","Less than 64.5","11"]]}
Preferably, the general decision table is Excel data. Each element in "headers" is a column header. "rows" is a two-dimensional array: each record in the first dimension is a row, and the data in the second dimension are the column values of that row.
The intermediate structure data is converted into Excel or imported into the target decision engine 400 to automatically generate a decision tree.
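The headers-and-rows layout shown above can be sketched in Java as follows. The class `DecisionTable` and its methods are hypothetical; the "input:" and "output:" prefixes are taken from the example, while the check that every row has one cell per header is an added assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class DecisionTable {
    final List<String> headers = new ArrayList<>();
    final List<List<String>> rows = new ArrayList<>();

    // Input variables come first, followed by output variables, as in the example.
    DecisionTable(List<String> inputVars, List<String> outputVars) {
        for (String name : inputVars) {
            headers.add("input:" + name);
        }
        for (String name : outputVars) {
            headers.add("output:" + name);
        }
    }

    // Adds one rule row; each cell lines up with the header at the same index.
    void addRow(String... cells) {
        if (cells.length != headers.size()) {
            throw new IllegalArgumentException(
                    "Expected " + headers.size() + " cells but got " + cells.length);
        }
        rows.add(new ArrayList<>(Arrays.asList(cells)));
    }
}
```

With the values from the example, the first row would be added as `addRow("", "", "33")`, an empty cell meaning that the corresponding input imposes no condition for that rule.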
Further, in step S350, the generating the general decision table according to the input sentence set, the output sentence set, and the rule information list set includes:
S351, generating a first decision table array according to each piece of associated input statement information in the input statement set and the corresponding output statement information in the output statement set.
And S352, generating a second decision table array according to the rule information list set.
S353, the first decision table array and the second decision table array are respectively used as key values to obtain a first key value pair object and a second key value pair object, and decision table elements are obtained according to the first key value pair object and the second key value pair object.
S354, obtaining the general decision table according to each decision table element.
In one embodiment, the method further comprises:
S500, generating custom variables applicable to the target decision engine 400 according to the intermediate structure data, wherein the custom variables are used for instructing the target decision engine 400 to generate the decision tree.
That is, the intermediate structure data is traversed, and the condition variables and result variables are converted into custom variables in the target decision engine 400, which specifically comprises the following steps:
S510, traversing the intermediate structure data to obtain each preceding assignment statement node.
S520, acquiring a target variable set defined in the target decision engine 400.
S530, screening variable assignment statement nodes from the preceding assignment statement nodes, acquiring the corresponding target variables in the target variable set, and storing the variables in the variable assignment statements and the target variables as key value pairs into preset map data.
For example, for the assignment statement _ARBFMT_12 = PUT(city_level, $12.); the variable in the intermediate structure data is "_ARBFMT_12". The present application maps it to the pair ("_ARBFMT_12", "city_level") and places the pair in the Map object; when finally converting to the decision tree, "_ARBFMT_12" is replaced with "city_level", which has been defined in advance in the target decision engine 400. That is, business personnel define the city_level field in the decision engine in advance, and a variable with the same name, city_level, is used when the model is defined in the SAS software. The city_level defined in the decision engine is then matched during conversion of the .sas file.
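The mapping in steps S510 to S530 can be sketched in Java as below. This is only an illustration under assumptions: the statements are given as plain strings, the regular expression covers only assignments of the form shown in the example, and `VariableMapper` is a hypothetical name.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class VariableMapper {
    // Assumed pattern for statements such as "_ARBFMT_12 = PUT(city_level, $12.);".
    private static final Pattern PUT_ASSIGNMENT = Pattern.compile(
            "^\\s*(\\w+)\\s*=\\s*PUT\\s*\\(\\s*(\\w+)\\s*,[^)]*\\)\\s*;?\\s*$",
            Pattern.CASE_INSENSITIVE);

    // Maps each script variable to the engine variable it was derived from,
    // keeping only variables that the target decision engine already defines.
    static Map<String, String> map(List<String> assignments, Set<String> targetVariables) {
        Map<String, String> result = new LinkedHashMap<>();
        for (String statement : assignments) {
            Matcher m = PUT_ASSIGNMENT.matcher(statement);
            if (m.matches() && targetVariables.contains(m.group(2))) {
                result.put(m.group(1), m.group(2));
            }
        }
        return result;
    }
}
```

Statements that do not match the pattern, such as "_NODE_ = 33;", are skipped, which corresponds to screening the variable assignment statement nodes out of all preceding assignment statement nodes.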
S540, traversing the intermediate structure data to respectively obtain if condition statements, elseif condition statements and else condition statements, and correspondingly obtaining a first-level condition and a second-level condition.
S550, recursively obtaining, step by step, the conditions of the sub-nodes respectively corresponding to the if condition statement, the elseif condition statement and the else condition statement, and correspondingly obtaining each first-level sub-condition, each second-level sub-condition and each third-level sub-condition.
S560, generating a first-level combination condition according to the first-level condition and each first-level sub-condition, generating a second-level combination condition according to the second-level condition and each second-level sub-condition, and generating a third-level combination condition according to each third-level sub-condition.
Specifically, the intermediate structure data is traversed to obtain the if condition statement and thus the first-level condition; the conditions of the sub-nodes corresponding to the if condition statement are recursively generated level by level to obtain each first-level sub-condition; and the first-level combination condition is generated according to the first-level condition and each first-level sub-condition.
The intermediate structure data is traversed to obtain the elseif condition statement and thus the second-level condition; the conditions of the sub-nodes corresponding to the elseif condition statement are recursively generated level by level to obtain each second-level sub-condition; and the second-level combination condition is generated according to the second-level condition and each second-level sub-condition.
The intermediate structure data is traversed to obtain the else condition statement; the conditions of the sub-nodes corresponding to the else condition statement are recursively generated level by level to obtain each third-level sub-condition; and the third-level combination condition is generated according to each third-level sub-condition.
Further, recursively generating, level by level, the conditions of the sub-nodes corresponding to the if condition statement, the elseif condition statement and the else condition statement respectively, and correspondingly obtaining each first-level sub-condition, each second-level sub-condition and each third-level sub-condition, comprises:
traversing until the decision object decisionItem node is not empty; when the decisionItem node is not empty, the node is a leaf node, and each first-level sub-condition, each second-level sub-condition and each third-level sub-condition is popped from the stack one by one to generate the combination condition of each level.
For example, the nested if conditions in the intermediate structure data are pushed onto a temporary stack one by one. If the first condition is, for example, that a is greater than or equal to 20, the recursion then reaches the condition of the next node under that if condition, for example that b is less than 10, and the relation between the two is "and"; both conditions are pushed onto the stack.
The recursive traversal terminates when the decision object node decisionItem of the intermediate structure data is not empty, which indicates that the node is already a leaf node. The conditions collected during the traversal are then popped from the stack one by one to form a combination condition, for example, a is greater than or equal to 20 and b is less than 10. In the present application, the if conditions and the elseif conditions are traversed recursively in sequence, and the else conditions are traversed recursively last.
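The stack-based recursion described above can be sketched as follows. This is only an illustration under stated assumptions: the Node type, the class name ConditionStackSketch and the decision labels are hypothetical, the conditions are joined only with "and", and the sketch reads the stack from the oldest entry to the newest at each leaf and pops on backtracking, so that sibling branches can reuse the shared prefix, rather than emptying the stack at every leaf.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ConditionStackSketch {
    // Hypothetical intermediate-structure node. A non-null decisionItem marks a leaf.
    static final class Node {
        final String condition;    // null for an else branch, which has no condition of its own
        final String decisionItem; // null for inner nodes
        final List<Node> children;

        Node(String condition, String decisionItem, List<Node> children) {
            this.condition = condition;
            this.decisionItem = decisionItem;
            this.children = children;
        }
    }

    // Traverses the top-level branches in order (if, elseif, else) and
    // returns combined condition -> decision, in traversal order.
    static Map<String, String> combine(List<Node> branches) {
        Map<String, String> combined = new LinkedHashMap<>();
        Deque<String> stack = new ArrayDeque<>();
        for (Node branch : branches) {
            collect(branch, stack, combined);
        }
        return combined;
    }

    private static void collect(Node node, Deque<String> stack, Map<String, String> combined) {
        boolean pushed = node.condition != null;
        if (pushed) {
            stack.push(node.condition);
        }
        if (node.decisionItem != null) {
            // Leaf reached: join the stacked conditions from oldest to newest.
            List<String> parts = new ArrayList<>();
            Iterator<String> oldestFirst = stack.descendingIterator();
            while (oldestFirst.hasNext()) {
                parts.add(oldestFirst.next());
            }
            combined.put(String.join(" and ", parts), node.decisionItem);
        } else {
            for (Node child : node.children) {
                collect(child, stack, combined);
            }
        }
        if (pushed) {
            stack.pop(); // backtrack so that sibling branches start from the parent prefix
        }
    }

    public static void main(String[] args) {
        Node ifBranch = new Node("a >= 20", null, List.of(
                new Node("b < 10", "approve", List.of()),
                new Node("b >= 10", "reject", List.of())));
        Node elseBranch = new Node(null, null, List.of(
                new Node("c < 5", "review", List.of())));
        System.out.println(combine(List.of(ifBranch, elseBranch)));
        // prints {a >= 20 and b < 10=approve, a >= 20 and b >= 10=reject, c < 5=review}
    }
}
```

The else branch contributes no condition of its own, so its combination condition consists only of its sub-conditions, in line with step S560.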
The present application automatically builds the decision tree in the target decision engine 400 by parsing the script exported from the SAS software. Specifically, antlr4 is used to parse the SAS script into an abstract syntax tree (AST); using the AST, the variable part, the condition part, the branch part and the action part of the decision tree model exported by the SAS software are each converted into a specific data structure (the intermediate structure data); the specific data structure is converted into an excel file; and the excel file is then imported into the decision engine system to generate the decision tree.
The custom variables are the variables used in the SAS file, obtained after the SAS file is parsed. For each variable, the parameter with the same name is then queried in the database by variable name. Parameters are organized into groups. According to the group to which a parameter belongs, a character string is spliced in a fixed format, namely input:group name.parameter name (for example, input:SAS_TEST.city_level), and the character strings are exported to excel.
When the SAS file is parsed, the statements that are matched are assignment statements and conditional statements (matching is performed according to the grammar rules defined in the G4 file). The input form is used in the conditions. In actual business use, when a variable is defined in the SAS software, it may be defined directly as SAS_TEST.city_level. The target decision engine then matches the variable against SAS_TEST.city_level by forced matching.
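The splicing of the engine variable name can be sketched as below. The database query is replaced by an in-memory map from parameter name to group name, and the exact string layout (no spaces, a dot between group and parameter) is inferred from the single example input:SAS_TEST.city_level; both are assumptions, and the class and method names are hypothetical.

```java
import java.util.Map;
import java.util.Optional;

class EngineVariableNameSketch {
    // groupByParameter stands in for the database lookup of parameters by name
    // (assumption: one group per parameter name).
    static Optional<String> toEngineReference(String variableName, Map<String, String> groupByParameter) {
        String group = groupByParameter.get(variableName);
        if (group == null) {
            return Optional.empty(); // no parameter with this name is defined in the engine
        }
        // Assumed fixed format: input:<group name>.<parameter name>
        return Optional.of("input:" + group + "." + variableName);
    }

    public static void main(String[] args) {
        Map<String, String> parameters = Map.of("city_level", "SAS_TEST");
        System.out.println(toEngineReference("city_level", parameters).orElse("(no match)"));
        // prints input:SAS_TEST.city_level
        System.out.println(toEngineReference("unknown_var", parameters).orElse("(no match)"));
        // prints (no match)
    }
}
```

Returning an empty Optional for an unmatched name is a design choice of this sketch; the application does not state how unmatched variables are handled.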
The application establishes a one-way conversion from the SAS ENTERPRISE MINER data mining model tool to the decision tree model in the target decision engine 400 and can recognize the script language of the SAS ENTERPRISE MINER tool. The output of the conversion is not limited to the decision tree model in the decision engine; it can also be excel or sql statements (which make it convenient for business personnel to analyze and verify the effect of the decision tree on a big data platform by means of sql), and so on. In this way, manual construction of the decision tree is replaced by automatic construction of the decision tree rules by a system function. The previous process was tedious, error-prone and inefficient; after the improvement, a large amount of manual configuration is eliminated, so that configuring the decision tree rules becomes efficient.
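As one possible illustration of the sql output form, combination conditions and their decisions could be rendered as a single CASE expression. The application does not specify the sql layout, so the CASE form, the class name SqlCaseSketch, the column alias and the decision labels below are all assumptions made for this sketch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class SqlCaseSketch {
    // rules maps a combination condition to its decision, in evaluation order.
    // The condition text is emitted as-is, so it is assumed to come from the
    // converter itself and not from untrusted input.
    static String toCaseExpression(Map<String, String> rules, String alias) {
        StringBuilder sql = new StringBuilder("CASE");
        for (Map.Entry<String, String> rule : rules.entrySet()) {
            sql.append(" WHEN ").append(rule.getKey())
               .append(" THEN '").append(rule.getValue().replace("'", "''")).append("'");
        }
        return sql.append(" END AS ").append(alias).toString();
    }

    public static void main(String[] args) {
        Map<String, String> rules = new LinkedHashMap<>();
        rules.put("a >= 20 and b < 10", "approve");
        rules.put("a >= 20 and b >= 10", "reject");
        System.out.println(toCaseExpression(rules, "decision"));
        // prints CASE WHEN a >= 20 and b < 10 THEN 'approve' WHEN a >= 20 and b >= 10 THEN 'reject' END AS decision
    }
}
```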
Fig. 3 is a schematic structural diagram of a conversion device of a domain model language and a general decision table according to an embodiment of the present application. The conversion device for the domain model language and the general decision table comprises a syntax tree generating module 1110, an intermediate structure data generating module 1120 and a decision table generating module 1130.
The syntax tree generating module 1110 is configured to convert, based on a grammar rule analysis file, a decision tree model file into grammar tree structure data by using the grammar parser 300, where the decision tree model file is a script file exported from the model data mining tool 200;
The intermediate structure data generating module 1120 is configured to traverse the syntax tree structure data and convert the syntax tree structure data into intermediate structure data according to a preset data conversion rule;
The decision table generating module 1130 is configured to generate a generic decision table according to the intermediate structure data, where the generic decision table is used to instruct the target decision engine 400 to generate a decision tree.
It will be appreciated that the apparatus of this embodiment corresponds to the method for converting the domain model language and the generic decision table of the above embodiment, and the options in the above embodiment are also applicable to this embodiment, so the description thereof will not be repeated here.
The application also provides a terminal device, which exemplarily comprises a processor and a memory, wherein the memory stores a computer program, and the processor executes the computer program so that the terminal device performs the above conversion method of the domain model language and the general decision table, or the functions of each module in the above conversion device of the domain model language and the general decision table.
The processor may be an integrated circuit chip with signal processing capabilities. The processor may be a general purpose processor, including at least one of a central processing unit (Central Processing Unit, CPU), a graphics processor (Graphics Processing Unit, GPU) and a network processor (Network Processor, NP), or may be a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. A general purpose processor may be a microprocessor, or the processor may be any conventional processor or the like, and may implement or perform the methods, steps, and logic blocks disclosed in embodiments of the present application.
The memory may be, but is not limited to, a random access memory (Random Access Memory, RAM), a read-only memory (Read-Only Memory, ROM), a programmable read-only memory (Programmable Read-Only Memory, PROM), an erasable programmable read-only memory (Erasable Programmable Read-Only Memory, EPROM), an electrically erasable programmable read-only memory (Electrically Erasable Programmable Read-Only Memory, EEPROM), etc. The memory is used for storing a computer program, and the processor can correspondingly execute the computer program after receiving the execution instruction.
The present application also provides a readable storage medium storing the computer program for use in the above terminal device.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other manners. The apparatus embodiments described above are merely illustrative; for example, the flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowcharts or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in alternative implementations, the functions noted in the blocks may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or by combinations of special purpose hardware and computer instructions.
In addition, functional modules or units in various embodiments of the application may be integrated together to form a single part, or the modules may exist alone, or two or more modules may be integrated to form a single part.
The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application, in essence, or the part of it that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product stored in a storage medium and comprising several instructions for causing a computer device (which may be a smart phone, a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present application. The storage medium includes a USB flash drive, a removable hard disk, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), a magnetic disk, an optical disk, or other various media capable of storing program code.
The foregoing is merely illustrative of the present application, and the present application is not limited thereto; any variation or substitution that a person skilled in the art can readily conceive of falls within the scope of the present application.