DataX example: read data from a stream and print it to the console

Reading from a stream and printing to the console

1) View the configuration template

[jason@hadoop102 bin]$ python datax.py -r streamreader -w streamwriter

 

DataX (DATAX-OPENSOURCE-3.0), From Alibaba !
Copyright (C) 2010-2017, Alibaba Group. All Rights Reserved.

Please refer to the streamreader document:
     https://github.com/alibaba/DataX/blob/master/streamreader/doc/streamreader.md

Please refer to the streamwriter document:
     https://github.com/alibaba/DataX/blob/master/streamwriter/doc/streamwriter.md

Please save the following configuration as a json file and  use
     python {DATAX_HOME}/bin/datax.py {JSON_FILE_NAME}.json
to run the job.

 

{
    "job": {
        "content": [
            {
                "reader": {
                    "name": "streamreader",
                    "parameter": {
                        "column": [],
                        "sliceRecordCount": ""
                    }
                },
                "writer": {
                    "name": "streamwriter",
                    "parameter": {
                        "encoding": "",
                        "print": true
                    }
                }
            }
        ],
        "setting": {
            "speed": {
                "channel": ""
            }
        }
    }
}
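
The template leaves several values empty ("" or []) for you to fill in. As a minimal sketch, the following Python snippet walks the template and lists those fields. The helper name unfilled_fields is hypothetical and only for illustration; it is not part of DataX.

```python
import json

# Template as printed by `datax.py -r streamreader -w streamwriter`.
TEMPLATE = json.loads("""
{
    "job": {
        "content": [
            {
                "reader": {
                    "name": "streamreader",
                    "parameter": {"column": [], "sliceRecordCount": ""}
                },
                "writer": {
                    "name": "streamwriter",
                    "parameter": {"encoding": "", "print": true}
                }
            }
        ],
        "setting": {"speed": {"channel": ""}}
    }
}
""")


def unfilled_fields(node, path=""):
    """Return the paths of all values that are still "" or [] (hypothetical helper)."""
    if node == "" or node == []:
        return [path]
    found = []
    if isinstance(node, dict):
        for key, value in node.items():
            found += unfilled_fields(value, f"{path}.{key}" if path else key)
    elif isinstance(node, list):
        for index, value in enumerate(node):
            found += unfilled_fields(value, f"{path}[{index}]")
    return found


if __name__ == "__main__":
    # Print one path per line for every field the template leaves empty.
    for field in unfilled_fields(TEMPLATE):
        print(field)
```

The four paths it reports (column, sliceRecordCount, encoding, channel) are the values filled in by the configuration file in the next step.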

2) Write a configuration file based on the template

[jason@hadoop102 job]$ vim stream2stream.json

Enter the following content:

{
  "job": {
    "content": [
      {
        "reader": {
          "name": "streamreader",
          "parameter": {
            "sliceRecordCount": 10,
            "column": [
              {
                "type": "long",
                "value": "10"
              },
              {
                "type": "string",
                "value": "hello,DataX"
              }
            ]
          }
        },
        "writer": {
          "name": "streamwriter",
          "parameter": {
            "encoding": "UTF-8",
            "print": true
          }
        }
      }
    ],
    "setting": {
      "speed": {
        "channel": 1
      }
    }
  }
}
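
If you prefer to generate the file rather than type it by hand, the same configuration can be built as a Python dict and serialized with the standard json module. This is only a sketch: the function build_stream_job and its parameters are hypothetical names chosen for illustration, while the keys and values mirror the file above.

```python
import json


def build_stream_job(columns, slice_record_count=10, channel=1, encoding="UTF-8"):
    """Build a streamreader -> streamwriter job as a plain dict (hypothetical helper)."""
    return {
        "job": {
            "content": [
                {
                    "reader": {
                        "name": "streamreader",
                        "parameter": {
                            "sliceRecordCount": slice_record_count,
                            # Each column is a constant value with a declared type.
                            "column": [
                                {"type": col_type, "value": value}
                                for col_type, value in columns
                            ],
                        },
                    },
                    "writer": {
                        "name": "streamwriter",
                        "parameter": {"encoding": encoding, "print": True},
                    },
                }
            ],
            "setting": {"speed": {"channel": channel}},
        }
    }


# Same two columns as in stream2stream.json.
job = build_stream_job([("long", "10"), ("string", "hello,DataX")])
job_json = json.dumps(job, indent=2)

if __name__ == "__main__":
    # Redirect this output to stream2stream.json to get an equivalent file.
    print(job_json)
```

Building the dict in code makes it easy to vary sliceRecordCount or channel between test runs without editing JSON by hand.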

3) Run the job

 

[jason@hadoop102 job]$ /opt/module/datax/bin/datax.py /opt/module/datax/job/stream2stream.json
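
To get a feel for what the job should print, here is a rough Python simulation of the configuration above. It is not DataX code and rests on two assumptions about plugin behavior that the original post does not state: that streamwriter joins the fields of each record with a tab, and that each channel emits sliceRecordCount copies of the record. The function name simulate_stream_job is hypothetical.

```python
# Compact copy of the stream2stream.json settings used in this example.
JOB = {
    "job": {
        "content": [
            {
                "reader": {
                    "name": "streamreader",
                    "parameter": {
                        "sliceRecordCount": 10,
                        "column": [
                            {"type": "long", "value": "10"},
                            {"type": "string", "value": "hello,DataX"},
                        ],
                    },
                },
                "writer": {
                    "name": "streamwriter",
                    "parameter": {"encoding": "UTF-8", "print": True},
                },
            }
        ],
        "setting": {"speed": {"channel": 1}},
    }
}


def simulate_stream_job(job):
    """Return the lines the job is expected to print (simulation, not DataX itself)."""
    content = job["job"]["content"][0]
    reader = content["reader"]["parameter"]
    channels = job["job"]["setting"]["speed"]["channel"]

    # With "print": false the writer would discard records instead of printing them.
    if not content["writer"]["parameter"]["print"]:
        return []

    # Assumption: fields of one record are joined with a tab character.
    line = "\t".join(str(column["value"]) for column in reader["column"])

    # Assumption: every channel emits sliceRecordCount copies of the record.
    return [line] * (reader["sliceRecordCount"] * channels)


if __name__ == "__main__":
    for record in simulate_stream_job(JOB):
        print(record)
```

Under these assumptions the simulation yields ten identical records containing the two configured column values; the real run additionally prints DataX's own startup and summary log lines around them.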
Original article: https://www.cnblogs.com/LIAOBO/p/13665320.html