DataX Installation and Usage

Official site: https://github.com/alibaba/DataX

Download: http://datax-opensource.oss-cn-hangzhou.aliyuncs.com/datax.tar.gz

Installation

1. Download the package:

wget http://datax-opensource.oss-cn-hangzhou.aliyuncs.com/datax.tar.gz 

2. Extract:

tar -zxvf datax.tar.gz 

3. Test the installation by running the built-in sample job:

cd datax
python2 bin/datax.py job/job.json

If the sample job finishes successfully, the installation is complete.

Examples

Exporting data from MySQL to HDFS:

1. Run the following command to generate a configuration template:

/usr/bin/python2 bin/datax.py -r mysqlreader -w hdfswriter

2. Edit the configuration in the template

File: jobs/mysql_to_hdfs.json

{
    "job": {
        "content": [
            {
                "reader": {
                    "name": "mysqlreader", 
                    "parameter": {
                        "column": [
                        "name",
                        "age",
                        ], 
                        "connection": [
                            {
                                "jdbcUrl": ["jdbc:mysql://hadoop103:3306/deng"], 
                                "table": ["user"]
                            }
                        ], 
                        "password": "123456", 
                        "username": "root", 
                        "where": ""
                    }
                }, 
                "writer": {
                    "name": "hdfswriter", 
                    "parameter": {
                        "column": [
                        
                        {
                         "name":"name",
                         "type":"VARCHAR"
                        },
                        {
                         "name":"age",
                         "type":"INT"
                        }
                        ], 
                        "compress": "", 
                        "defaultFS": "hdfs://hadoop102:8020", 
                        "fieldDelimiter": "	", 
                        "fileName": "city.txt", 
                        "fileType": "text", 
                        "path": "/", 
                        "writeMode": "append"
                    }
                }
            }
        ], 
        "setting": {
            "speed": {
                "channel": 1
            }
        }
    }
}
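The job file must be valid JSON, and the reader's column list has to line up one-to-one with the writer's, or the fields end up misaligned in HDFS. A quick pre-flight check can catch both problems before submitting the job. The sketch below is a generic sanity pass written for this tutorial, not part of DataX itself:

```python
import json

def check_job(job):
    """Rough sanity checks on a parsed DataX job dict (a sketch, not a DataX API)."""
    content = job["job"]["content"][0]
    reader, writer = content["reader"], content["writer"]
    # The reader's column list and the writer's column list should line up
    # one-to-one; otherwise the writer stores misaligned fields.
    n_read = len(reader["parameter"]["column"])
    n_write = len(writer["parameter"]["column"])
    assert n_read == n_write, f"column count mismatch: {n_read} vs {n_write}"
    return reader["name"], writer["name"]

# Usage (json.load itself rejects trailing commas, a common hand-editing slip):
#   with open("jobs/mysql_to_hdfs.json") as f:
#       check_job(json.load(f))
```

Once the file parses cleanly, submit it the same way as the sample job: python2 bin/datax.py jobs/mysql_to_hdfs.json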

Exporting data from HDFS to MySQL:

1. Generate the template

/usr/bin/python2 bin/datax.py -r hdfsreader -w mysqlwriter

2. File: jobs/hdfs_to_mysql.json

{
    "job": {
        "content": [
            {
                "reader": {
                    "name": "hdfsreader", 
                    "parameter": {
                        "column": [
                            "*"
                        ], 
                        "defaultFS": "hdfs://hadoop102:8020", 
                        "encoding": "UTF-8", 
                        "fieldDelimiter": "	", 
                        "fileType": "text", 
                        "path": "/city.txt__0fc0fad9_a12a_4e9f_9720_a2f996c844c0"
                    }
                }, 
                "writer": {
                    "name": "mysqlwriter", 
                    "parameter": {
                        "column": [
                        "name",
                        "age"
                        ], 
                        "connection": [
                            {
                                "jdbcUrl": "jdbc:mysql://hadoop103:3306/deng",
                                "table": ["user_2"]
                            }
                        ], 
                        "password": "123456", 
                        "preSql": [], 
                        "session": [], 
                        "username": "root", 
                        "writeMode": "insert"
                    }
                }
            }
        ], 
        "setting": {
            "speed": {
                "channel": 1
            }
        }
    }
}
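The two jobs are coupled through the field delimiter: hdfswriter puts it between fields, and hdfsreader later splits each line on it, so the two "fieldDelimiter" values must be the same character. Inside a JSON string a real tab must be written as the escape "\t" (a raw tab typed between the quotes is not valid JSON). A short round-trip check, using made-up sample values:

```python
import json

# "\t" in a JSON string decodes to an actual tab character, so the
# delimiter hdfswriter puts between fields is the same one hdfsreader
# later splits on.
writer_conf = json.loads('{"fieldDelimiter": "\\t"}')
reader_conf = json.loads('{"fieldDelimiter": "\\t"}')
assert writer_conf["fieldDelimiter"] == "\t"
assert writer_conf["fieldDelimiter"] == reader_conf["fieldDelimiter"]

# A sample line as the writer would emit it ("zhangsan"/"20" are
# invented values), and the reader's view of it after splitting:
line = "zhangsan" + writer_conf["fieldDelimiter"] + "20"
print(line.split(reader_conf["fieldDelimiter"]))  # ['zhangsan', '20']
```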

Original article: https://www.cnblogs.com/knighterrant/p/15202884.html