
CN113806429B - Canvas-type log analysis method based on big data stream processing framework - Google Patents

Canvas-type log analysis method based on big data stream processing framework

Info

Publication number
CN113806429B
CN113806429B (application CN202010533924.3A)
Authority
CN
China
Prior art keywords
canvas
type
stream processing
operator
data stream
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010533924.3A
Other languages
Chinese (zh)
Other versions
CN113806429A (en)
Inventor
陈飞
赖键锋
廖子渊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sangfor Technologies Co Ltd
Original Assignee
Sangfor Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sangfor Technologies Co Ltd filed Critical Sangfor Technologies Co Ltd
Priority to CN202010533924.3A priority Critical patent/CN113806429B/en
Publication of CN113806429A publication Critical patent/CN113806429A/en
Application granted granted Critical
Publication of CN113806429B publication Critical patent/CN113806429B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25Integrating or interfacing systems involving database management systems
    • G06F16/258Data format conversion from or to a database
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2458Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2471Distributed queries
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/248Presentation of query results

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Fuzzy Systems (AREA)
  • Mathematical Physics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Software Systems (AREA)
  • Debugging And Monitoring (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The application discloses a canvas-type log analysis method, system, and storage medium based on a big data stream processing framework. The method comprises the following steps: acquiring operators in a canvas working area according to a user's drag operations on the canvas; forming a log analysis workflow from the operators; converting the workflow into a corresponding Json format file; and generating from the Json format file a distributed program that runs on an executor, so that the executor executes the distributed program to analyze log data. This solves the prior-art problems of complex operation, unguaranteed reliability and stability, and difficult maintenance when log analysis is performed with the open-source program of a big data stream processing framework, and achieves simple and efficient operation of log analysis on a big data stream processing framework.

Description

Canvas-type log analysis method based on big data stream processing framework
Technical Field
The application relates to the technical field of big data, and in particular to a canvas-type log analysis method based on a big data stream processing framework.
Background
With the rapid development of internet technology, the amount of generated data has surged, and how to use this big data for production planning and other arrangements is of great significance to enterprises.
At present, a big data stream processing platform can be used to process data generated in an enterprise's production process, thereby realizing analysis of streaming data. However, current big data stream processing platforms mainly perform data analysis through the component operation flow of an open-source system: the operation is complex, the framework and principles of the platform must be understood, and the platform is difficult to use for business personnel without a big data background.
Disclosure of Invention
The embodiments of the application aim to solve the prior-art problems of complex operation, uncontrollable reliability and stability, and difficult maintenance when log analysis is performed with the open-source program of a big data stream processing platform, by providing a canvas-type log analysis method, system, and storage medium based on a big data stream processing framework, and to achieve simple and efficient operation of log analysis on a big data stream processing framework.
In order to achieve the above object, the present application provides a canvas-type log analysis method based on a big data stream processing framework, comprising the following steps:
Acquiring operators in a canvas working area according to a user's drag operations on the canvas;
Forming a log analysis workflow from the operators;
Converting the workflow into a corresponding Json format file;
And generating from the Json format file a distributed program that runs on an executor, so that the executor executes the distributed program to analyze log data.
Optionally, the step of generating from the Json format file a distributed program that runs on an executor includes:
Converting the Json format file into a DSL description file;
Generating a corresponding job flow graph according to the DSL description file;
And generating, according to the job flow graph, a distributed program run by the executor.
Optionally, before the step of generating, according to the job flow graph, a distributed program run by the executor, the method includes:
Carrying out operator integration, verification and rule matching operations on the job flow graph.
Optionally, before the step of generating, according to the job flow graph, a distributed program run by the executor, the method includes:
Constructing an execution environment according to the type of the job flow graph, and setting parameter configuration.
Optionally, the step of constructing the execution environment according to the type of the job flow graph includes:
Judging the type of the job flow graph according to the source operators contained in the job flow graph;
If the type of the job flow graph is a batch processing type, constructing an execution environment corresponding to batch processing;
If the type of the job flow graph is a stream processing type, constructing an execution environment corresponding to stream processing.
Optionally, after the step of constructing the execution environment according to the type of the job flow graph and setting the parameter configuration, the method includes:
Collecting the data for parameter configuration;
Formatting the data and performing field operations on it.
Optionally, after the step of acquiring operators in the canvas working area according to the user's drag operations on the canvas, the method includes:
If the operators comprise a custom operator,
Acquiring the jar dependency file corresponding to the custom operator;
Storing the jar dependency file and writing its storage path into the Json file.
In addition, in order to achieve the above object, the present application also provides a system for canvas-type log analysis based on a big data stream processing framework, wherein the system comprises a canvas-type log analysis terminal based on a big data stream processing framework; wherein
the canvas-type log analysis terminal based on the big data stream processing framework comprises a memory, a processor, and a canvas-type log analysis program based on the big data stream processing framework that is stored in the memory and executable on the processor; when executed by the processor, the canvas-type log analysis program based on the big data stream processing framework implements the following steps:
Acquiring operators in a canvas working area according to a user's drag operations on the canvas;
Forming a log analysis workflow from the operators;
Converting the workflow into a corresponding Json format file;
And generating from the Json format file a distributed program that runs on an executor, so that the executor executes the distributed program to analyze log data.
In addition, in order to achieve the above object, the present application further provides a canvas-type log analysis device based on a big data stream processing framework, wherein the device comprises:
an acquisition module, which acquires operators in a canvas working area according to a user's drag operations on the canvas;
an engine module, which converts the workflow into a corresponding Json format file;
and an executor module, which generates from the Json format file a distributed program that runs on an executor, so that the executor executes the distributed program to analyze log data.
In addition, in order to achieve the above object, the present application provides a computer-readable storage medium, wherein a canvas-type log analysis program based on a big data stream processing framework is stored on the storage medium, and when the canvas-type log analysis program based on the big data stream processing framework is executed by a processor, the method according to any one of the above is implemented.
In the embodiments, operators in a canvas working area are obtained according to a user's drag operations on the canvas, a log analysis workflow is formed from the operators, and the workflow is converted into a corresponding Json format file; a distributed program that runs on an executor is then generated from the Json format file, so that the executor executes the distributed program to analyze log data. Log analysis is completed by dragging operators into the canvas working area, without writing code against a big data stream processing program and without having to learn the framework and its principles, thereby achieving canvas-type log analysis based on a big data stream processing framework and simplifying the operation of big data log analysis.
Drawings
FIG. 1 is a schematic diagram of a terminal structure of a hardware operating environment according to an embodiment of the present application;
FIG. 2 is a flow diagram of an embodiment of a canvas-type log analysis method based on a big data stream processing framework according to the present application;
FIG. 3 is a flow diagram of another embodiment of a canvas-type log analysis method based on a big data stream processing framework according to the present application;
FIG. 4 is a flow diagram of another embodiment of a canvas-type log analysis method based on a big data stream processing framework according to the present application;
FIG. 5 is a flow diagram of constructing an execution environment according to the type of the job flow graph in the present application;
FIG. 6 is a flow diagram of the steps after constructing an execution environment and setting parameter configuration according to the type of the job flow graph in the present application;
FIG. 7 is a flow diagram of a canvas-type log analysis method based on a big data stream processing framework according to another embodiment of the present application.
Detailed Description
It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the application.
The main solution of the embodiments of the present application is: acquiring operators in a canvas working area according to a user's drag operations on the canvas; forming a log analysis workflow from the operators; converting the workflow into a corresponding Json format file; and generating from the Json format file a distributed program that runs on an executor, so that the executor executes the distributed program to analyze log data. In the prior art, when a big data stream processing platform is used for log analysis, the import from a data source, the processing of the data, and the analysis of the results are all realized by writing code on the big data stream processing platform; log analysis realized in this way suffers from complex operation, unguaranteed reliability and stability, and difficult maintenance. Therefore, the application builds a distributed program that can run on an executor by dragging already-encapsulated big data stream processing framework operators into the canvas working area, realizes log analysis based on a big data stream processing framework without writing code on the big data stream processing platform, and achieves simple and efficient operation of log analysis on a big data stream processing framework.
As shown in fig. 1, fig. 1 is a schematic diagram of a terminal structure of a hardware running environment according to an embodiment of the present application.
As shown in fig. 1, the terminal may include: a processor 1001, such as a CPU, a network interface 1004, a user interface 1003, a memory 1005, and a communication bus 1002. The communication bus 1002 is used to enable connection and communication between these components. The user interface 1003 may include a display and an input unit such as a keyboard; optionally, the user interface 1003 may further include a standard wired interface and a wireless interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (e.g., a WiFi interface). The memory 1005 may be a high-speed RAM memory or a stable non-volatile memory, such as disk storage. The memory 1005 may also optionally be a storage device separate from the processor 1001.
Optionally, the terminal may also include a camera, RF (Radio Frequency) circuitry, sensors, a television 1006, audio circuitry, a WiFi module, detectors, and so on. Of course, the terminal may be further configured with other sensors such as a gyroscope, a barometer, a hygrometer, and a temperature sensor, which will not be described here.
It will be appreciated by those skilled in the art that the terminal structure shown in fig. 1 is not limiting of the terminal device and may include more or fewer components than shown, or may combine certain components, or a different arrangement of components.
As shown in FIG. 1, the memory 1005, which is a computer-readable storage medium, may include an operating system, a network communication module, a user interface module, and a canvas-type log analysis program based on a big data stream processing framework.
In the terminal shown in fig. 1, the network interface 1004 is mainly used for connecting to a background server and performing data communication with the background server; the user interface 1003 is mainly used for connecting to a client (user side) and performing data communication with the client; and the processor 1001 may be configured to invoke the canvas-type log analysis program based on the big data stream processing framework stored in the memory 1005 and perform the following operations:
Acquiring operators in a canvas working area according to a user's drag operations on the canvas;
Forming a log analysis workflow from the operators;
Converting the workflow into a corresponding Json format file;
And generating from the Json format file a distributed program that runs on an executor, so that the executor executes the distributed program to analyze log data.
Referring to fig. 2, fig. 2 is a flow chart of an embodiment of the canvas-type log analysis method based on a big data stream processing framework according to the present application. The canvas-type log analysis method based on a big data stream processing framework includes:
Step S10, acquiring operators in a canvas working area according to a user's drag operations on the canvas;
Step S20, forming a log analysis workflow from the operators;
The canvas is the operation interface of the canvas-type log analysis system based on the big data stream processing framework. The canvas design can be divided into an upper part and a lower part: the encapsulated big data stream processing framework operators serve as basic operators laid out in the upper part of the interface, and the lower part is the canvas working area. The user selects the operators needed for a log analysis by clicking with the mouse and drags them into the canvas working area. It will be appreciated that the layout of the canvas may be chosen by the operator according to actual requirements. "Acquiring" means that a program provided in the system obtains the operators present in the working area. The operator types include source operators (e.g. a Kafka source operator), processing operators (e.g. a parsing rule operator, an association rule operator, a big data stream processing framework SQL operator, and a union operator), destination operators (e.g. a search engine destination operator and a Kafka destination operator), and custom operators (e.g. an Async operator). When log analysis is performed by dragging on the canvas, different types of operators are needed to complete the analysis of the logs.
According to the log analysis work to be processed, the user connects operators of different types in execution order, thereby forming the corresponding workflow.
Step S30, converting the workflow into a corresponding Json format file;
A Json format file is a data-interchange format file that can easily be parsed and generated by a computer. Its syntactic forms include Json objects, Json arrays, and Json nesting. In the canvas-type log analysis system based on the big data stream processing framework, the workflow formed by the connections between the operators in the canvas working area is converted into the Json format file corresponding to that workflow.
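For illustration only, the following sketch shows how a front end might serialize such a canvas workflow into a Json format file using the Jackson library; the job name, operator types, field names and configuration values are assumptions made for the example and are not the actual schema used by the application.

```java
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.node.ArrayNode;
import com.fasterxml.jackson.databind.node.ObjectNode;

public class WorkflowToJson {
    public static void main(String[] args) throws Exception {
        ObjectMapper mapper = new ObjectMapper();

        // Top-level workflow object: one node per canvas operator, one edge per connection.
        ObjectNode workflow = mapper.createObjectNode();
        workflow.put("jobName", "nginx-error-log-analysis");   // hypothetical job name

        ArrayNode operators = workflow.putArray("operators");
        ObjectNode source = operators.addObject();
        source.put("id", "op_1").put("type", "kafka_source");
        source.putObject("config").put("topic", "raw-logs").put("bootstrapServers", "kafka:9092");

        ObjectNode parse = operators.addObject();
        parse.put("id", "op_2").put("type", "parse_rule");
        parse.putObject("config").put("pattern", "level=ERROR");

        ObjectNode sink = operators.addObject();
        sink.put("id", "op_3").put("type", "es_sink");
        sink.putObject("config").put("index", "log-analysis-result");

        // Edges encode the drag-and-connect order on the canvas, i.e. the data flow direction.
        ArrayNode edges = workflow.putArray("edges");
        edges.addObject().put("from", "op_1").put("to", "op_2");
        edges.addObject().put("from", "op_2").put("to", "op_3");

        System.out.println(mapper.writerWithDefaultPrettyPrinter().writeValueAsString(workflow));
    }
}
```

The resulting document carries everything the engine layer needs: the operator nodes with their configuration and the edges giving the execution order.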
Step S40, generating from the Json format file a distributed program that runs on an executor, so that the executor executes the distributed program to analyze log data.
After the Json format file is generated, the system passes it to the engine layer, which further processes the Json format file to generate a distributed program that can run on the executor.
In the present application, the Flink stream data processing platform is taken as an example. When a user uses the log analysis system to analyze data, the Flink source operator already encapsulated in the system can be dragged onto the canvas to import the data to be processed into Flink; then, according to the analysis operations required on the data, the corresponding processing operators are dragged in from the system's toolbar. In general, when the data comes from an external data source, the external data can be imported into Flink by dragging a source operator. Further, a dragged processing operator (transformation) converts the data, and the converted data is written to an external data source by a destination operator. It should be noted that a Flink job is generally composed of a source operator, processing operators, and a destination operator; that is, in the present application, the import, analysis, and storage of analysis results of stream data is realized by dragging these three kinds of operators. There may be multiple processing operators, and the data flow direction can be determined according to the execution order of the operators.
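As a minimal, non-limiting sketch of what the generated distributed program could look like, the following Java example wires a Kafka source operator, a simple parsing/filter transformation, and a stand-in destination operator with the Flink DataStream API (assuming Flink 1.14+ and its KafkaSource connector); the broker address, topic, and filter rule are illustrative assumptions.

```java
import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.kafka.source.KafkaSource;
import org.apache.flink.connector.kafka.source.enumerator.initializer.OffsetsInitializer;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class CanvasLogJob {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Source operator: what a "Kafka source" canvas node could translate into.
        KafkaSource<String> source = KafkaSource.<String>builder()
                .setBootstrapServers("kafka:9092")          // hypothetical broker address
                .setTopics("raw-logs")                      // hypothetical topic
                .setGroupId("canvas-log-analysis")
                .setStartingOffsets(OffsetsInitializer.latest())
                .setValueOnlyDeserializer(new SimpleStringSchema())
                .build();

        DataStream<String> logs =
                env.fromSource(source, WatermarkStrategy.noWatermarks(), "kafka-source");

        // Processing operator: a parsing-rule node could become a simple filter/map chain.
        DataStream<String> errors = logs
                .filter(line -> line.contains("ERROR"))
                .map(line -> line.trim());

        // Destination operator: printing stands in for a search-engine or Kafka sink.
        errors.print();

        env.execute("canvas-log-analysis");
    }
}
```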
In this embodiment, when the system detects that the user has dragged operators, it obtains the operators in the canvas working area and the connections between them, derives the workflow of the log analysis to be executed from those connections, converts the workflow into a Json format file (a computer description language), and further generates from the Json format file a distributed program that can run on the executor, so that the executor completes the analysis of the logs. The user can analyze logs by dragging encapsulated operators on the canvas; log analysis can be completed without writing program code on the big data stream processing platform, achieving simple and efficient operation of log analysis on a big data stream processing framework.
Referring to fig. 3, fig. 3 is a schematic diagram of another embodiment of the present application, in which the step of generating from the Json format file a distributed program that runs on an executor includes:
Step S41, converting the Json format file into a DSL description file;
Step S42, generating a corresponding job flow graph according to the DSL description file;
Step S43, generating, according to the job flow graph, a distributed program run by the executor.
A DSL is a language that describes the objects, rules, and modes of operation of a particular domain. For example, when there is an SQL statement in the Json, the SQL statement is eventually handed to the corresponding database for processing; the database reads the useful information from the SQL statement and returns the corresponding result. By converting the Json format file into a DSL description, a job flow graph is generated at execution time, and the corresponding distributed program running on the executor is generated according to the job flow graph, where the job flow graph is a data structure representing a job that is recognized by the data flow engine of the big data stream processing framework.
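The following is a minimal sketch, under assumed names, of how such a job flow graph might be represented in memory on the engine side; it is not the Flink JobGraph API, only an illustration of the nodes-plus-edges structure described above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory representation of the job flow graph parsed from the DSL description.
class JobFlowGraph {
    // Each node keeps the operator type ("kafka_source", "parse_rule", ...) and its config.
    static final class Node {
        final String id;
        final String type;
        final Map<String, String> config = new HashMap<>();
        Node(String id, String type) { this.id = id; this.type = type; }
    }

    final Map<String, Node> nodes = new HashMap<>();
    final Map<String, List<String>> edges = new HashMap<>();  // from-id -> list of to-ids

    void addNode(String id, String type) { nodes.put(id, new Node(id, type)); }

    void addEdge(String from, String to) {
        edges.computeIfAbsent(from, k -> new ArrayList<>()).add(to);
    }

    // Source operators are nodes that never appear as the target of an edge.
    List<Node> sourceNodes() {
        List<Node> sources = new ArrayList<>(nodes.values());
        edges.values().forEach(targets -> targets.forEach(t -> sources.remove(nodes.get(t))));
        return sources;
    }
}
```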
Before the step of generating, according to the job flow graph, the distributed program run by the executor, the method includes:
Step S44, performing operator integration, verification, and rule matching operations on the job flow graph.
It will be appreciated that a job flow graph generated from the DSL language may contain duplicate operators (for example, multiple source operators), so the job flow graph may need further analysis and optimization, where the analysis and optimization includes operations such as operator integration, verification, and rule matching on the job flow graph. For example, when a user needs to analyze the data of several logs, several Kafka source operators are dragged in to import the corresponding log data; during processing, if the processing operators used are the same (i.e. they perform the same operation), the data can be analyzed with the command corresponding to a single operator, avoiding the unreasonable resource allocation caused by repeated operations.
In this embodiment, when there are multiple data sources, that is, duplicate operator nodes in the job flow graph, the system analyzes whether the operators refer to the same data source and then optimizes them, so that the same operation instruction is not executed repeatedly on the executor, which would reduce the speed of log analysis by the system.
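A hedged sketch of the operator-integration step is shown below, reusing the hypothetical JobFlowGraph structure from the previous example: source nodes with identical type and configuration are merged so that the same instruction is not executed twice on the executor.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class OperatorIntegration {
    // Merge nodes whose type and configuration are identical, e.g. two Kafka source
    // operators dragged onto the canvas that actually read the same topic.
    static void mergeDuplicateSources(JobFlowGraph graph) {
        Map<String, String> keyToKeptId = new HashMap<>();

        for (JobFlowGraph.Node node : List.copyOf(graph.nodes.values())) {
            String key = node.type + "|" + node.config;      // identity = type + config
            String kept = keyToKeptId.putIfAbsent(key, node.id);
            if (kept == null || kept.equals(node.id)) {
                continue;                                    // first occurrence, keep it
            }
            // Re-point the duplicate's outgoing edges onto the kept node, then drop it.
            List<String> outgoing = graph.edges.remove(node.id);
            if (outgoing != null) {
                outgoing.forEach(to -> graph.addEdge(kept, to));
            }
            graph.nodes.remove(node.id);
        }
    }
}
```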
Referring to fig. 4, fig. 4 is a flowchart of another embodiment of the present application. Before the step of performing the operator integration, verification, and rule matching operations on the job flow graph, the method includes:
Step S45, constructing an execution environment according to the type of the job flow graph, and setting parameter configuration.
In this embodiment, the job flow graphs generated from the DSL description are of two types: a job flow graph corresponding to a stream processing job and a job flow graph corresponding to a batch processing job. The configuration parameters of the job are set according to these two different job flow graph types.
Referring to fig. 5, the step of constructing an execution environment according to the type of the job flow graph includes:
Step S451, judging the type of the job flow graph according to the source operators contained in the job flow graph;
Step S452, if the type of the job flow graph is a batch processing type, constructing an execution environment corresponding to batch processing;
Step S453, if the type of the job flow graph is a stream processing type, constructing an execution environment corresponding to stream processing.
In a batch processing job the log data is bounded, whereas in a stream processing job the log data is unbounded. It will be appreciated that batch processing can execute a series of related tasks one after another, sequentially or in parallel; for example, an internet music service may each year take the record of the songs a user listened to during the past year as the data source of a batch job and run an analysis to produce, as the output, data about the user's listening habits. Stream processing, in contrast, deals with a continuously changing series of data: the data source is dynamic, and log analysis must be performed on the data source in real time, for example for real-time recommendation.
In this method, the type of the job flow graph to be processed is judged according to the type of the source operators it contains, the engine is controlled to construct the corresponding execution environment and to set the configuration parameters of the job, the API encapsulation of the operators and the transfer and conversion of the schema are completed by traversing the whole pipeline, and finally the job is submitted to the distributed engine for execution.
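The following sketch illustrates one way the engine could construct the execution environment from the source operator type, assuming Flink 1.12+ where a single StreamExecutionEnvironment can be switched between batch and streaming execution; the mapping from operator type to boundedness and the parameter values are assumptions made for the example.

```java
import org.apache.flink.api.common.RuntimeExecutionMode;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

class ExecutionEnvironmentFactory {
    // Hypothetical mapping: bounded sources (e.g. a file or JDBC source) imply a batch job,
    // unbounded sources (e.g. a Kafka source) imply a stream job.
    static StreamExecutionEnvironment forSourceType(String sourceType) {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        boolean bounded = "file_source".equals(sourceType) || "jdbc_source".equals(sourceType);
        env.setRuntimeMode(bounded ? RuntimeExecutionMode.BATCH : RuntimeExecutionMode.STREAMING);
        env.setParallelism(4);                       // example parameter configuration
        if (!bounded) {
            env.enableCheckpointing(60_000);         // example parameter configuration (ms)
        }
        return env;
    }
}
```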
In a specific embodiment, the application also provides a batch-stream fusion method based on asynchronous IO, which constructs an execution environment capable of performing stream processing and batch processing at the same time and supports querying and using batch-processed data during stream processing. Specifically, by introducing an asynchronous IO operator, the system can support typical batch-stream fusion scenarios, interact efficiently with external systems, and access an external database to query the full historical data while processing stream data. Batch-stream fusion can support a typical Lambda architecture: on the one hand, stream data is processed in real time by the stream processing engine, and the real-time results enter a stream data middleware for real-time query and display, while the stream data is also written to a data storage middleware for full analysis; on the other hand, the stream data is stored into a MongoDB or Redis database or the like by the batch processing engine, and while the stream processing engine processes the stream data, the results of the batch processing engine are obtained by asynchronously querying that database.
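A minimal sketch of such an asynchronous IO operator is given below using Flink's RichAsyncFunction and AsyncDataStream; the in-memory map stands in for the MongoDB or Redis lookup, and the timeout and capacity values are illustrative.

```java
import java.util.Collections;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.datastream.AsyncDataStream;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.functions.async.ResultFuture;
import org.apache.flink.streaming.api.functions.async.RichAsyncFunction;

// Enrich each streaming record with a value looked up asynchronously from an external store.
class AsyncEnrichFunction extends RichAsyncFunction<String, String> {
    private transient Map<String, String> externalStore;   // stand-in for a MongoDB/Redis client

    @Override
    public void open(Configuration parameters) {
        // A real job would open a MongoDB/Redis connection here; a map keeps the sketch self-contained.
        externalStore = Map.of("10.0.0.1", "gateway", "10.0.0.2", "db-server");
    }

    @Override
    public void asyncInvoke(String sourceIp, ResultFuture<String> resultFuture) {
        CompletableFuture
                .supplyAsync(() -> externalStore.getOrDefault(sourceIp, "unknown-host"))
                .thenAccept(host -> resultFuture.complete(
                        Collections.singleton(sourceIp + " -> " + host)));
    }

    // Wiring: apply the async operator to a stream with a timeout and an in-flight request cap.
    static DataStream<String> enrich(DataStream<String> ips) {
        return AsyncDataStream.unorderedWait(ips, new AsyncEnrichFunction(), 1000, TimeUnit.MILLISECONDS, 100);
    }
}
```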
Referring to fig. 6, after the step of constructing the execution environment according to the type of the job flow graph and setting the parameter configuration, the method includes:
Step S46, collecting the data for parameter configuration;
Step S47, formatting the data and performing field operations on it.
In this embodiment, according to the different job flow graphs, the data requiring parameter configuration is formatted and field operations are performed on it.
In this embodiment, an execution environment corresponding to the job flow graph is constructed according to the different job flow graph types so that the job flow graph can be parameterized, and a scenario covering both the batch job type and the stream job type is provided, so that batch data can be queried and used during stream processing.
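As an illustrative sketch of the formatting and field operations, the following Flink MapFunction splits a raw log line into fields and normalizes its timestamp; the assumed log layout ("timestamp|level|message") is an example, not a format prescribed by the application.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.api.java.tuple.Tuple3;

// Turn a raw line such as "2020/06/11 10:15:00|WARN|disk usage 91%" into structured fields.
class LogFieldMapper implements MapFunction<String, Tuple3<String, String, String>> {
    private static final DateTimeFormatter IN  = DateTimeFormatter.ofPattern("yyyy/MM/dd HH:mm:ss");
    private static final DateTimeFormatter OUT = DateTimeFormatter.ISO_LOCAL_DATE_TIME;

    @Override
    public Tuple3<String, String, String> map(String line) {
        String[] fields = line.split("\\|", 3);                       // field operation: split
        String ts = LocalDateTime.parse(fields[0], IN).format(OUT);   // formatting: normalize timestamp
        return Tuple3.of(ts, fields[1].trim(), fields[2].trim());
    }
}
```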
Referring to fig. 7, fig. 7 is a flowchart of another embodiment of the present application. After the step of acquiring operators in the canvas working area according to the user's drag operations on the canvas, the method includes:
Step S50, if the operators comprise a custom operator;
In this embodiment, an operator corresponds to a specific operation performed on the executor that the user selects by dragging the operator into the canvas working area. In an alternative embodiment, the operation may be a concrete calculation such as addition, subtraction, multiplication, or division, an operation that obtains a maximum, minimum, or average value, or an operation that builds a mathematical model, such as a loop operation model. It can be understood that the big data stream processing framework APIs frequently used in log analysis are encapsulated in the system as operators that can realize log analysis by dragging on the canvas, i.e. the operators are laid out in the toolbar area of the canvas according to their functional attributes; a custom operator is one that the user drags into the canvas working area when, for the logs to be processed, the operators in the toolbar cannot meet the analysis requirements.
Step S60, obtaining the jar dependency file corresponding to the custom operator;
Step S70, storing the jar dependency file and writing its storage path into the Json file.
It can be understood that when a user needs a custom operator to complete the log analysis, the custom operator's implementation does not yet exist in the system; that is, the system contains no operation corresponding to the custom operator that can run on the executor. For this purpose, the user can import the jar dependency file corresponding to the custom operator through the import command of the log analysis system, and the storage path of that jar dependency file is written into the Json file, so that when the Json file is parsed, the client program can, according to the resource request parameters parsed from the Json file, submit all the jars contained in the workflow to the cluster and perform resource allocation.
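For illustration, the sketch below shows one way an engine could read the stored jar path back out of the Json file and instantiate the custom operator class; the field names customJarPath and customClass are hypothetical, and in practice the jar would normally also be attached to the job submission so the cluster can resolve it.

```java
import java.io.File;
import java.net.URL;
import java.net.URLClassLoader;

import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

class CustomOperatorLoader {
    // Reads the stored jar path from the workflow Json and instantiates the custom operator class.
    static Object loadCustomOperator(String workflowJson) throws Exception {
        JsonNode node = new ObjectMapper().readTree(workflowJson);
        String jarPath   = node.get("customJarPath").asText();    // hypothetical field written by the front end
        String className = node.get("customClass").asText();      // hypothetical field, e.g. "com.example.AsyncOp"

        URL jarUrl = new File(jarPath).toURI().toURL();
        URLClassLoader loader =
                new URLClassLoader(new URL[]{jarUrl}, CustomOperatorLoader.class.getClassLoader());
        // The loader is intentionally left open: the operator's classes are resolved lazily at runtime.
        return loader.loadClass(className).getDeclaredConstructor().newInstance();
    }
}
```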
In this embodiment, the user can complete the log analysis by importing a custom operator, without being limited to the system's existing operators, which is convenient to operate.
In addition, in order to achieve the above object, the present application also provides a system for canvas-type log analysis based on a big data stream processing framework, wherein the system comprises a canvas-type log analysis terminal based on a big data stream processing framework; wherein
the canvas-type log analysis terminal based on the big data stream processing framework comprises a memory, a processor, and a canvas-type log analysis program based on the big data stream processing framework that is stored in the memory and executable on the processor; when executed by the processor, the canvas-type log analysis program based on the big data stream processing framework implements the following steps:
Acquiring operators in a canvas working area according to a user's drag operations on the canvas;
Forming a log analysis workflow from the operators;
Converting the workflow into a corresponding Json format file;
And generating from the Json format file a distributed program that runs on an executor, so that the executor executes the distributed program to analyze log data.
In addition, in order to achieve the above object, the present application also provides a canvas-type log analysis device based on a big data stream processing framework, the device comprising:
an acquisition module, which acquires the operators in the canvas working area;
an engine module, which forms the corresponding Json format file from the operators;
and an executor module, which generates from the Json format file a distributed program that runs on an executor, so that the executor executes the distributed program to analyze log data.
In addition, in order to achieve the above object, the present application also provides a computer-readable storage medium, on which a canvas-type log analysis program based on a big data stream processing framework is stored; when executed by a processor, the program implements the method as described above.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It should be noted that in the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The application may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In a unit claim enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, third, etc. does not denote any order; these words may be interpreted as names.
While preferred embodiments of the present application have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following claims be interpreted as including the preferred embodiments and all such alterations and modifications as fall within the scope of the application.
It will be apparent to those skilled in the art that various modifications and variations can be made to the present application without departing from the spirit or scope of the application. Thus, it is intended that the present application also include such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.

Claims (7)

1. A canvas-type log analysis method based on a big data stream processing framework, the method comprising:
Acquiring operators in a canvas working area according to a user's drag operations on the canvas;
Forming a log analysis workflow from the operators;
Converting the workflow into a corresponding Json format file;
Converting the Json format file into a DSL description file;
Generating a corresponding job flow graph according to the DSL description file, wherein the job flow graph is a data structure representing a job that is recognized by the data flow engine of the big data stream processing framework;
Constructing an execution environment according to the type of the job flow graph and setting parameter configuration, including: judging the type of the job flow graph according to the source operators contained in the job flow graph; if the type of the job flow graph is a batch processing type, constructing an execution environment corresponding to batch processing; and if the type of the job flow graph is a stream processing type, constructing an execution environment corresponding to stream processing;
And generating, according to the job flow graph, a distributed program run by an executor, so that the executor executes the distributed program to analyze log data.
2. The canvas-type log analysis method based on a big data stream processing framework according to claim 1, wherein before the step of generating, according to the job flow graph, a distributed program run by an executor, the method comprises:
carrying out operator integration, verification and rule matching operations on the job flow graph.
3. The canvas-type log analysis method based on a big data stream processing framework according to claim 1, wherein after the step of constructing an execution environment according to the type of the job flow graph and setting parameter configuration, the method comprises:
collecting the data for parameter configuration;
and formatting the data and performing field operations on it.
4. The canvas-type log analysis method based on a big data stream processing framework according to claim 1, wherein after the step of acquiring operators in a canvas working area according to a user's drag operations on the canvas, the method comprises:
if the operators comprise a custom operator,
acquiring the jar dependency file corresponding to the custom operator;
and storing the jar dependency file and writing its storage path into the Json file.
5. A system for canvas-type log analysis based on a big data stream processing framework, the system comprising a canvas-type log analysis terminal based on a big data stream processing framework; wherein
the canvas-type log analysis terminal based on the big data stream processing framework comprises a memory, a processor, and a canvas-type log analysis program based on the big data stream processing framework that is stored in the memory and executable on the processor; when executed by the processor, the canvas-type log analysis program based on the big data stream processing framework implements the following steps:
Acquiring operators in a canvas working area according to a user's drag operations on the canvas;
Forming a log analysis workflow from the operators;
Converting the workflow into a corresponding Json format file;
Converting the Json format file into a DSL description file;
Generating a corresponding job flow graph according to the DSL description file, wherein the job flow graph is a data structure representing a job that is recognized by the data flow engine of the big data stream processing framework;
Constructing an execution environment according to the type of the job flow graph and setting parameter configuration, including: judging the type of the job flow graph according to the source operators contained in the job flow graph; if the type of the job flow graph is a batch processing type, constructing an execution environment corresponding to batch processing; and if the type of the job flow graph is a stream processing type, constructing an execution environment corresponding to stream processing;
And generating, according to the job flow graph, a distributed program run by an executor, so that the executor executes the distributed program to analyze log data.
6. A canvas-type log analysis device based on a big data stream processing framework, the device comprising:
an acquisition module, which acquires operators in a canvas working area according to a user's drag operations on the canvas;
an engine module, which forms a log analysis workflow from the operators; converts the workflow into a corresponding Json format file; converts the Json format file into a DSL description file; generates a corresponding job flow graph according to the DSL description file, wherein the job flow graph is a data structure representing a job that is recognized by the data flow engine of the big data stream processing framework; and constructs an execution environment according to the type of the job flow graph and sets parameter configuration, including: judging the type of the job flow graph according to the source operators contained in the job flow graph; if the type of the job flow graph is a batch processing type, constructing an execution environment corresponding to batch processing; and if the type of the job flow graph is a stream processing type, constructing an execution environment corresponding to stream processing;
and an executor module, which generates, according to the job flow graph, a distributed program run by an executor, so that the executor executes the distributed program to analyze log data.
7. A computer-readable storage medium, wherein the storage medium has stored thereon a canvas-type log analysis program based on a big data stream processing framework which, when executed by a processor, implements the method of any one of claims 1 to 4.
CN202010533924.3A 2020-06-11 2020-06-11 Canvas type log analysis method based on big data stream processing frame Active CN113806429B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010533924.3A CN113806429B (en) 2020-06-11 2020-06-11 Canvas type log analysis method based on big data stream processing frame

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010533924.3A CN113806429B (en) 2020-06-11 2020-06-11 Canvas type log analysis method based on big data stream processing frame

Publications (2)

Publication Number Publication Date
CN113806429A (en) 2021-12-17
CN113806429B (en) 2024-10-11

Family

ID=78892158

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010533924.3A Active CN113806429B (en) 2020-06-11 2020-06-11 Canvas type log analysis method based on big data stream processing frame

Country Status (1)

Country Link
CN (1) CN113806429B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115499303A (en) * 2022-08-29 2022-12-20 浪潮软件科技有限公司 Log analysis tool based on Flink
CN115357309B (en) * 2022-10-24 2023-07-14 深信服科技股份有限公司 Data processing method, device, system and computer readable storage medium
CN116501386B (en) * 2023-03-31 2024-01-26 中国船舶集团有限公司第七一九研究所 Automatic calculation program solving method based on data pool and related device

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107678790A (en) * 2016-07-29 2018-02-09 华为技术有限公司 Flow calculation methodologies, apparatus and system
CN109522138A (en) * 2018-11-14 2019-03-26 北京中电普华信息技术有限公司 A kind of processing method and system of distributed stream data
CN110941467A (en) * 2019-11-06 2020-03-31 第四范式(北京)技术有限公司 Data processing method, device and system

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8832125B2 (en) * 2010-06-14 2014-09-09 Microsoft Corporation Extensible event-driven log analysis framework
US10643144B2 (en) * 2015-06-05 2020-05-05 Facebook, Inc. Machine learning system flow authoring tool
KR101955376B1 (en) * 2016-12-29 2019-03-08 서울대학교산학협력단 Processing method for a relational query in distributed stream processing engine based on shared-nothing architecture, recording medium and device for performing the method
CN110704290B (en) * 2019-09-27 2024-02-13 百度在线网络技术(北京)有限公司 Log analysis method and device
CN111209309B (en) * 2020-01-13 2023-03-10 腾讯科技(深圳)有限公司 Method, device and equipment for determining processing result of data flow graph and storage medium

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107678790A (en) * 2016-07-29 2018-02-09 华为技术有限公司 Flow calculation methodologies, apparatus and system
CN109522138A (en) * 2018-11-14 2019-03-26 北京中电普华信息技术有限公司 A kind of processing method and system of distributed stream data
CN110941467A (en) * 2019-11-06 2020-03-31 第四范式(北京)技术有限公司 Data processing method, device and system

Also Published As

Publication number Publication date
CN113806429A (en) 2021-12-17

Similar Documents

Publication Publication Date Title
CN107943463B (en) Interactive mode automation big data analysis application development system
CN113806429B (en) Canvas type log analysis method based on big data stream processing frame
CN112199086B (en) Automatic programming control system, method, device, electronic equipment and storage medium
CN111666296B (en) SQL data real-time processing method and device based on Flink, computer equipment and medium
CN110908641B (en) Visualization-based stream computing platform, method, device and storage medium
CN102609451B (en) SQL (structured query language) query plan generation method oriented to streaming data processing
CN108984155B (en) Data processing flow setting method and device
CN105550268A (en) Big data process modeling analysis engine
CN113467771B (en) Model-based industrial edge cloud collaboration system and method
CN104331640A (en) Biocloud platform-based project conclusion report analysis system and method
AU2017254506B2 (en) Method, apparatus, computing device and storage medium for data analyzing and processing
CN113961183B (en) Visual programming method, device, equipment and storage medium
CN112286957B (en) API application method and system of BI system based on structured query language
US11379499B2 (en) Method and apparatus for executing distributed computing task
CN112732994B (en) Web page information extraction method, device, device and storage medium
CN108985367A (en) Computing engines selection method and more computing engines platforms based on this method
CN110908789B (en) Visual data configuration method and system for multi-source data processing
CN110674083A (en) Workflow migration method, apparatus, device, and computer-readable storage medium
CN108121742A (en) The generation method and device of user's disaggregated model
CN110968579A (en) Execution plan generation and execution method, database engine and storage medium
US12260349B2 (en) Interactive guidance system for selecting thermodynamics methods in process simulations
CN118444900A (en) Application program construction method based on low-code platform and low-code platform
CN114064601A (en) Storage process conversion method, device, equipment and storage medium
CN104239630B (en) A kind of emulation dispatch system of supportive test design
CN118035204A (en) Data blood edge display method, distributed task scheduling system and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant