Getting Started with DataX

1: Download

Download page: https://github.com/alibaba/DataX

On that page, follow [Quick Start] ---> [Download DataX] to download the package. The downloaded file is named datax.tar.gz.

After extraction, the {datax} directory contains these subdirectories: {bin conf job lib log log_perf plugin script tmp}.
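As a quick sanity check after extraction, the directory listing above can be verified with a few lines of Python. This is only a sketch: the helper name missing_subdirs is hypothetical, the D:/datax location is an assumption, and the expected names are simply the ones listed above, which may differ between DataX releases.

```python
from pathlib import Path

# Subdirectory names taken from the listing above; other releases may differ.
EXPECTED_SUBDIRS = ["bin", "conf", "job", "lib", "log", "log_perf", "plugin", "script", "tmp"]


def missing_subdirs(datax_home):
    """Return the expected subdirectories that are absent under datax_home."""
    home = Path(datax_home)
    return [name for name in EXPECTED_SUBDIRS if not (home / name).is_dir()]


if __name__ == "__main__":
    # D:/datax is an assumed install location; adjust to where you extracted the archive.
    print(missing_subdirs("D:/datax") or "all expected directories are present")
```

An empty list means every directory named above was found.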

2: Installation

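The downloaded archive is ready to use as soon as it is extracted; no further installation is needed (I extracted it directly to the D: drive).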

 3: Test

Go to the bin directory under datax. It contains datax.py, which can be tested from cmd:

python D:\datax\bin\datax.py D:\datax\job\job.json
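The same command can also be launched from a script. Below is a minimal sketch, assuming the D:\datax install location used in this post; build_command is a hypothetical helper, and python_exe should point to an interpreter that your copy of datax.py can actually run under.

```python
import subprocess
import sys
from pathlib import PureWindowsPath


def build_command(datax_home, job_file, python_exe=sys.executable):
    """Build the argument list that runs bin/datax.py with one job file."""
    datax_py = PureWindowsPath(datax_home) / "bin" / "datax.py"
    return [python_exe, str(datax_py), str(PureWindowsPath(job_file))]


if __name__ == "__main__":
    cmd = build_command(r"D:\datax", r"D:\datax\job\job.json")
    # check=False: inspect the exit code ourselves instead of raising on failure.
    result = subprocess.run(cmd, check=False)
    print("datax.py exited with code", result.returncode)
```

Passing the command as a list avoids shell quoting problems when a path contains spaces.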

If the output is garbled, switch the console code page to UTF-8 by entering this at the command line:

CHCP 65001

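 Also, if your Python version is 3.x, you need to modify the three .py files in the bin folder. Modified versions are available at https://github.com/HxYyWw/DatatX_python3/tree/master (a classmate's GitHub repository).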

4: Application

 Writing a CSV file to MySQL

{
    "job": {
        "setting": {
            "speed": {
                "channel": 3
            }
        },
        "content": [
            {
                "reader": {
                    "name": "txtfilereader",
                    "parameter": {
                        "path":["D:\\desktop\\datax.csv"],
                        "encoding":"gbk",
                        "column" : [
                            {
                                "index":0,
                                "type":"string"
                            },{
                                "index":1,
                                "type":"long"
                            }
                        ],
                        "fieldDelimiter":","
                    }
                },
                "writer": {
                    "name": "mysqlwriter",
                    "parameter": {
                        "writeMode": "insert",
                        "username": "root",
                        "password": "1234",
                        "column": [
                            "name",
                            "value"
                        ],
                        "preSql": [
                            "truncate table datax"
                        ],
                        "connection": [ 
                            {
                                "jdbcUrl": "jdbc:mysql://localhost:3306/test?useUnicode=true&characterEncoding=utf-8",
                                "table": [
                                    "datax"
                                ]
                            }
                        ]
                    }
                }
            }
        ]
    }
}
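To try the job above, an input file is needed. The sketch below writes a two-column, comma-separated file in GBK so that it matches the reader settings. No header line is written, since the job above does not configure header skipping. The sample rows and the helper name write_sample_csv are made up for illustration.

```python
import csv


def write_sample_csv(path, rows, encoding="gbk"):
    """Write (name, value) rows as a comma-delimited file without a header line."""
    # newline="" lets the csv module control line endings itself.
    with open(path, "w", newline="", encoding=encoding) as f:
        writer = csv.writer(f, delimiter=",")
        for name, value in rows:
            # Column 1 is read as a long by the job, so store an integer.
            writer.writerow([name, int(value)])


if __name__ == "__main__":
    # Hypothetical sample data; the path matches the reader's "path" setting.
    write_sample_csv(r"D:\desktop\datax.csv", [("张三", 1), ("李四", 2)])
```

The column order (name first, value second) follows the index 0 and index 1 entries of the reader and the column list of the writer.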


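 Configuring a job that extracts data from a MySQL database to the local machine (here streamwriter prints the rows to the console):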

{
    "job": {
        "setting": {
            "speed": {
                "channel": 3
            },
            "errorLimit": {
                "record": 0,
                "percentage": 0.02
            }
        },
        "content": [
            {
                "reader": {
                    "name": "mysqlreader",
                    "parameter": {
                        "username": "root",
                        "password": "1234",
                        "column": [
                            "id",
                            "name",
                            "value"
                        ],
                        "splitPk": "id",
                        "connection": [
                            {
                                "table": [
                                    "datax"
                                ],
                                "jdbcUrl": [
                                    "jdbc:mysql://localhost:3306/test?useUnicode=true&characterEncoding=utf-8"
                                ]
                            }
                        ]
                    }
                },
                "writer": {
                    "name": "streamwriter",
                    "parameter": {
                        "print":true
                    }
                }
            }
        ]
    }
}
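Because a job file is plain JSON, simple mistakes such as an unescaped backslash in a Windows path or a missing reader or writer can be caught before DataX is started. The following is a minimal sketch of such a check. summarize_job is a hypothetical helper, not part of DataX, and it only looks at the keys used in the two jobs shown in this post.

```python
import json


def summarize_job(job_text):
    """Parse a DataX job and return (reader name, writer name) per content entry."""
    job = json.loads(job_text)  # raises json.JSONDecodeError on malformed JSON
    pairs = []
    for entry in job["job"]["content"]:
        # A KeyError here means the entry lacks a reader, a writer, or a name.
        pairs.append((entry["reader"]["name"], entry["writer"]["name"]))
    return pairs


if __name__ == "__main__":
    # Assumed job file location; adjust to your own job path.
    with open(r"D:\datax\job\job.json", encoding="utf-8") as f:
        print(summarize_job(f.read()))
```

For the job above, the function would return one pair naming mysqlreader and streamwriter.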

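If the database connection fails, a required jar (for example the MySQL JDBC driver) may be missing; putting the jar in lib should fix it.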
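Whether a driver jar is present can be checked with a short script. This is a sketch under assumptions: find_jars is a hypothetical helper, the D:\datax\lib path is taken from the install location used in this post, and the default keyword assumes the driver file name contains "mysql-connector", as MySQL Connector/J jars usually do.

```python
from pathlib import Path


def find_jars(directory, keyword="mysql-connector"):
    """Return sorted names of .jar files in directory whose name contains keyword."""
    return sorted(
        p.name for p in Path(directory).glob("*.jar") if keyword in p.name.lower()
    )


if __name__ == "__main__":
    # Assumed location; adjust if your jars live elsewhere.
    found = find_jars(r"D:\datax\lib")
    print(found or "no matching jar found")
```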

For details, see https://github.com/alibaba/DataX/blob/master/mysqlreader/doc/mysqlreader.md

Original post: https://www.cnblogs.com/zmh-980509/p/12409571.html