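1. Download
Download page: https://github.com/alibaba/DataX
On that page, follow [Quick Start] ---> [Download DataX下载地址] to download the package. The downloaded file is named datax.tar.gz.
After extraction, the {datax} directory contains the subdirectories {bin conf job lib log log_perf plugin script tmp}.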
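2. Install
There is no separate install step: extract the downloaded archive and it is ready to use (I extracted it straight to drive D:).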
3. Test
Go into the bin directory under datax. It contains datax.py, which you can test from cmd:
python D:\datax\bin\datax.py D:\datax\job\job.json
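The backslashes in a command like this are easy to lose when copying it around, so here is a minimal sketch that assembles the same command from its parts. The function name build_datax_command and its parameters are my own for illustration and are not part of DataX; the paths assume the D:\datax layout used in this post.

```python
import sys
from pathlib import PureWindowsPath


def build_datax_command(datax_home, job_file, python_exe=sys.executable):
    """Return the argument list that launches one DataX job.

    datax_home is the extracted datax directory (assumed here to be D:\\datax),
    job_file is the path to the job JSON file.
    """
    # The launcher script lives in <datax_home>\bin\datax.py.
    launcher = PureWindowsPath(datax_home) / "bin" / "datax.py"
    return [python_exe, str(launcher), str(PureWindowsPath(job_file))]
```

The returned list can be handed to subprocess.run(cmd, check=True) to start the job, which avoids quoting problems when a path contains spaces.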
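If the output shows up as garbled characters, switch the console to the UTF-8 code page by entering this in the command line:
CHCP 65001
Also, if your Python version is 3.x, you need to modify the three .py files in the bin folder. Modified versions are available at https://github.com/HxYyWw/DatatX_python3/tree/master (a classmate's GitHub repository).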
4. Usage
Writing a CSV file to MySQL
{
    "job": {
        "setting": {
            "speed": {
                "channel": 3
            }
        },
        "content": [
            {
                "reader": {
                    "name": "txtfilereader",
                    "parameter": {
                        "path": ["D:\\desktop\\datax.csv"],
                        "encoding": "gbk",
                        "column": [
                            { "index": 0, "type": "string" },
                            { "index": 1, "type": "long" }
                        ],
                        "fieldDelimiter": ","
                    }
                },
                "writer": {
                    "name": "mysqlwriter",
                    "parameter": {
                        "writeMode": "insert",
                        "username": "root",
                        "password": "1234",
                        "column": ["name", "value"],
                        "preSql": ["truncate table datax"],
                        "connection": [
                            {
                                "jdbcUrl": "jdbc:mysql://localhost:3306/test?useUnicode=true&characterEncoding=utf-8",
                                "table": ["datax"]
                            }
                        ]
                    }
                }
            }
        ]
    }
}
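To try this job you need an input file that matches the reader's column layout: a string in column 0 and an integer in column 1, comma-separated, in GBK. Below is a small sketch that writes such a file. The helper name write_sample_csv and the sample rows are made up for illustration; only the delimiter, the encoding and the two-column layout come from the job above.

```python
import csv


def write_sample_csv(path, rows, encoding="gbk"):
    """Write (name, value) rows as a comma-separated file in the given encoding."""
    # newline="" lets the csv module control line endings itself.
    with open(path, "w", newline="", encoding=encoding) as f:
        writer = csv.writer(f, delimiter=",")
        for name, value in rows:
            # Column 1 is read as an integer type by the job, so force an int here.
            writer.writerow([name, int(value)])
```

For example, write_sample_csv(r"D:\desktop\datax.csv", [("a", 1), ("b", 2)]) would produce a two-line file at the path the job reads from.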
Configuring a job that extracts data from a MySQL database and prints it locally:
{
    "job": {
        "setting": {
            "speed": {
                "channel": 3
            },
            "errorLimit": {
                "record": 0,
                "percentage": 0.02
            }
        },
        "content": [
            {
                "reader": {
                    "name": "mysqlreader",
                    "parameter": {
                        "username": "root",
                        "password": "1234",
                        "column": ["id", "name", "value"],
                        "splitPk": "id",
                        "connection": [
                            {
                                "table": ["datax"],
                                "jdbcUrl": [
                                    "jdbc:mysql://localhost:3306/test?useUnicode=true&characterEncoding=utf-8"
                                ]
                            }
                        ]
                    }
                },
                "writer": {
                    "name": "streamwriter",
                    "parameter": {
                        "print": true
                    }
                }
            }
        ]
    }
}
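Before running a job file it can help to confirm that it is valid JSON and that it names the reader and writer you expect. The sketch below does only that. The function summarize_job is my own helper, not something DataX ships, and it assumes the job/setting/content layout used in the examples in this post.

```python
import json


def summarize_job(job_text):
    """Parse a DataX job document and return its reader, writer and channel count."""
    # json.loads raises json.JSONDecodeError on malformed input,
    # for example a Windows path written with single backslashes.
    job = json.loads(job_text)["job"]
    first = job["content"][0]
    return {
        "reader": first["reader"]["name"],
        "writer": first["writer"]["name"],
        "channel": job.get("setting", {}).get("speed", {}).get("channel"),
    }
```

Reading the job file as text and passing it to summarize_job gives a quick sanity check before handing the file to datax.py.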
If the database connection fails, the MySQL JDBC driver jar may be missing. Putting the jar into lib fixes it.
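To see whether a driver jar is already present anywhere under the DataX directory, a quick search like the following may help. This is only a sketch: the function name is hypothetical, and the file pattern mysql-connector*.jar is my assumption about how the driver jar is named, so adjust it if your jar is called something else.

```python
from pathlib import Path


def find_mysql_driver_jars(datax_home):
    """Return every file under datax_home whose name looks like a MySQL driver jar."""
    root = Path(datax_home)
    # Assumption: the driver follows the usual mysql-connector*.jar naming.
    return sorted(p for p in root.rglob("mysql-connector*.jar") if p.is_file())
```

An empty list from find_mysql_driver_jars(r"D:\datax") suggests the jar still needs to be copied in.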
For details, see https://github.com/alibaba/DataX/blob/master/mysqlreader/doc/mysqlreader.md