|
1 | 1 | # python_data_lineage
|
2 | 2 | Data lineage tools in python
|
3 | 3 |
|
4 |
| - cd widget |
5 |
| - python -m http.server 8000 |
6 |
| - http://localhost:8000/ |
| 4 | +## Step 1: Prepare the environment |
| 5 | + * Install Python 3 |
| 6 | + * Install Java JDK 1.8 |
| 7 | + |
| 8 | +## Step 2: Start the web server |
| 9 | + Change to this project's widget directory and run the following command to start the web server: |
| 10 | + |
| 11 | + `python -m http.server 8000` |
| 12 | + |
| 13 | + Open the following URL in a browser to verify that the server started: http://localhost:8000/ |
| 14 | + |
| 15 | + Note: if you change port 8000, you must also update widget_server_url in dlineage.py. |
| 16 | + |
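The browser check above can also be scripted. The following is a minimal sketch, not part of this project: the helper name `widget_server_is_up` and the constant `WIDGET_SERVER_URL` are hypothetical, and the URL is assumed to match the value of widget_server_url in dlineage.py.

```python
import urllib.error
import urllib.request

# Assumption: this must match widget_server_url in dlineage.py.
WIDGET_SERVER_URL = "http://localhost:8000/"


def widget_server_is_up(url: str = WIDGET_SERVER_URL, timeout: float = 2.0) -> bool:
    """Return True if an HTTP GET to `url` is answered with status 200."""
    # An empty ProxyHandler bypasses any system proxy, since the server is local.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status.
        return False
```

Calling `widget_server_is_up()` before running dlineage.py with /graph gives an early hint when the web server from this step is not running or is listening on a different port.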
| 17 | +## Step 3: Run the Python script |
| 18 | + Change to the project root directory (the directory containing dlineage.py) and run the following command: |
| 19 | + |
| 20 | + `python dlineage.py /f test.sql /graph` |
| 21 | + |
| 22 | + This command runs lineage analysis on test.sql and opens a browser page that displays the result as a graph. |
| 23 | + |
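If the analysis needs to be started from another Python program, the same command can be launched with `subprocess`. This is a sketch under assumptions: the helper names `build_dlineage_command` and `run_dlineage` are hypothetical, and the script is assumed to be started from the project root so that dlineage.py is found.

```python
import subprocess
import sys
from typing import List


def build_dlineage_command(sql_file: str, graph: bool = True) -> List[str]:
    """Build the argument list for: python dlineage.py /f <sql_file> [/graph]."""
    # sys.executable reuses the interpreter that runs this script.
    command = [sys.executable, "dlineage.py", "/f", sql_file]
    if graph:
        command.append("/graph")
    return command


def run_dlineage(sql_file: str, graph: bool = True) -> int:
    """Run dlineage.py from the current directory and return its exit code."""
    completed = subprocess.run(build_dlineage_command(sql_file, graph), check=False)
    return completed.returncode
```

Passing the arguments as a list rather than a single string avoids shell quoting problems when the SQL file path contains spaces.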
| 24 | + Command-line options supported by dlineage.py: |
| 25 | + |
| 26 | + /f: Optional, the full path to SQL file. |
| 27 | + |
| 28 | + /d: Optional, the full path to the directory includes the SQL files. |
| 29 | + |
| 30 | + /j: Optional, return the result including the join relation. |
| 31 | + |
| 32 | + /s: Optional, simple output, ignore the intermediate results. |
| 33 | + |
| 34 | + /topselectlist: Optional, simple output with top select results. |
| 35 | + |
| 36 | + /withTemporaryTable: Optional, simple output with the temporary tables. |
| 37 | + |
| 38 | + /i: Optional, same as /s, but keeps the resultset generated by SQL functions; equivalent to /s /topselectlist plus keeping the resultset generated by SQL functions. |
| 39 | + |
| 40 | + /showResultSetTypes: Optional, simple output with the specified resultset types, separated by commas. Resultset types: array, struct, result_of, cte, insert_select, update_select, merge_update, merge_insert, output, update_set, pivot_table, unpivot_table, alias, rs, function, case_when |
| 41 | + |
| 42 | + /if: Optional, keep all intermediate resultsets, but remove the resultsets generated by SQL functions. |
| 43 | + |
| 44 | + /ic: Optional, ignore the coordinates in the output. |
| 45 | + |
| 46 | + /lof: Optional, link orphan columns to the first table. |
| 47 | + |
| 48 | + /traceView: Optional, only output the names of source tables and views, ignore all intermediate data. |
| 49 | + |
| 50 | + /text: Optional, valid only when /s is used, output the column dependency in text mode. |
| 51 | + |
| 52 | + /json: Optional, print the json format output. |
| 53 | + |
| 54 | + /tableLineage [/csv /delimiter]: Optional, output table level lineage. |
| 55 | + |
| 56 | + /csv: Optional, output column level lineage in csv format. |
| 57 | + |
| 58 | + /delimiter: Optional, the delimiter of output column level lineage in csv format. |
| 59 | + |
| 60 | + /t: Optional, set the database type. |
| 61 | + Supported values: access,bigquery,couchbase,dax,db2,greenplum,hana,hive,impala,informix,mdx,mssql, |
| 62 | + sqlserver,mysql,netezza,odbc,openedge,oracle,postgresql,postgres,redshift,snowflake, |
| 63 | + sybase,teradata,soql,vertica. The default value is oracle. |
| 64 | + |
| 65 | + /env: Optional, specify a metadata.json to get the database metadata information. |
| 66 | + |
| 67 | + /transform: Optional, output the relation transform code. |
| 68 | + |
| 69 | + /coor: Optional, output the relation transform coordinate, but not the code. |
| 70 | + |
| 71 | + /defaultDatabase: Optional, specify the default database. |
| 72 | + |
| 73 | + /defaultSchema: Optional, specify the default schema. |
| 74 | + |
| 75 | + /showImplicitSchema: Optional, show implicit schema. |
| 76 | + |
| 77 | + /showConstant: Optional, show constant table. |
| 78 | + |
| 79 | + /treatArgumentsInCountFunctionAsDirectDataflow: Optional, treat arguments in count function as direct dataflow. |
| 80 | + |
| 81 | + /filterRelationTypes: Optional, supports fdd, fdr, join, call, er; multiple relation types are separated by commas. |
| 82 | + |
| 83 | + /graph: Optional, open a browser page that displays the lineage analysis result as a graph. |
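
Several options above take values from a fixed list (/t and /showResultSetTypes). The sketch below checks such values before they are passed to dlineage.py. It is illustrative only: the helper `build_option_args` is hypothetical, the two value sets are copied from the option descriptions above (treating update_set and pivot_table as separate types is an assumption), and this README does not state whether dlineage.py itself rejects unknown values.

```python
from typing import Iterable, List, Optional

# Values listed for /t above; "oracle" is the documented default.
SUPPORTED_DB_TYPES = frozenset(
    "access,bigquery,couchbase,dax,db2,greenplum,hana,hive,impala,informix,"
    "mdx,mssql,sqlserver,mysql,netezza,odbc,openedge,oracle,postgresql,"
    "postgres,redshift,snowflake,sybase,teradata,soql,vertica".split(",")
)

# Values listed for /showResultSetTypes above.
# Assumption: update_set and pivot_table are two separate types.
RESULT_SET_TYPES = frozenset(
    "array,struct,result_of,cte,insert_select,update_select,merge_update,"
    "merge_insert,output,update_set,pivot_table,unpivot_table,alias,rs,"
    "function,case_when".split(",")
)


def build_option_args(
    db_type: str = "oracle",
    result_set_types: Optional[Iterable[str]] = None,
    simple: bool = False,
) -> List[str]:
    """Return validated option arguments to append to a dlineage.py command."""
    if db_type not in SUPPORTED_DB_TYPES:
        raise ValueError("unsupported database type: %s" % db_type)
    args = ["/t", db_type]
    if simple:
        args.append("/s")
    if result_set_types is not None:
        chosen = list(result_set_types)
        unknown = [name for name in chosen if name not in RESULT_SET_TYPES]
        if unknown:
            raise ValueError("unknown resultset types: %s" % ", ".join(unknown))
        # The option takes a single comma-separated value.
        args += ["/showResultSetTypes", ",".join(chosen)]
    return args
```

For example, `build_option_args("mysql", ["cte", "function"], simple=True)` yields the arguments `/t mysql /s /showResultSetTypes cte,function`, which can be appended to `python dlineage.py /f test.sql`.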