2.3 Hive Data Types, and Using Python Scripts for ETL in Real Projects

I. Hive Data Types

https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Types

Numeric Types
        · TINYINT (1-byte signed integer, from -128 to 127)
        · SMALLINT (2-byte signed integer, from -32,768 to 32,767)
        · INT (4-byte signed integer, from -2,147,483,648 to 2,147,483,647)
        · BIGINT (8-byte signed integer, from -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807)
        · FLOAT (4-byte single precision floating point number)
        · DOUBLE (8-byte double precision floating point number)
        · DECIMAL
                · Introduced in Hive 0.11.0 with a precision of 38 digits
                · Hive 0.13.0 introduced user definable precision and scale
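
The integer ranges listed above follow directly from the byte widths: an n-byte signed integer spans -2^(8n-1) to 2^(8n-1) - 1. The following is a minimal Python sketch of that arithmetic. The helper names `signed_range` and `smallest_int_type` are made up for illustration and are not part of Hive.

```python
# Byte widths taken from the Numeric Types list above.
HIVE_INT_BYTES = {"TINYINT": 1, "SMALLINT": 2, "INT": 4, "BIGINT": 8}


def signed_range(num_bytes):
    """Return (min, max) for a two's-complement signed integer of this width."""
    bits = 8 * num_bytes
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


def smallest_int_type(value):
    """Return the narrowest Hive integer type that can hold value, or None."""
    for name, num_bytes in HIVE_INT_BYTES.items():
        low, high = signed_range(num_bytes)
        if low <= value <= high:
            return name
    return None


# Print the range of every integer type, narrowest first.
for name, num_bytes in HIVE_INT_BYTES.items():
    print(name, signed_range(num_bytes))
```

A helper like `smallest_int_type` can be handy when picking column types for a new table: for example, a value of 40000 does not fit in SMALLINT, so INT is the narrowest choice.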


Date/Time Types
        · TIMESTAMP (Note: Only available starting with Hive 0.8.0)
        · DATE (Note: Only available starting with Hive 0.12.0)


String Types
    · STRING
    · VARCHAR (Note: Only available starting with Hive 0.12.0)
    · CHAR (Note: Only available starting with Hive 0.13.0)


Misc Types
    · BOOLEAN
    · BINARY (Note: Only available starting with Hive 0.8.0)



Complex Types
    · arrays: ARRAY<data_type> (Note: negative values and non-constant expressions are allowed as of Hive 0.14.)
    · maps: MAP<primitive_type, data_type> (Note: negative values and non-constant expressions are allowed as of Hive 0.14.)
    · structs: STRUCT<col_name : data_type [COMMENT col_comment], ...>
    · union: UNIONTYPE<data_type, data_type, ...> (Note: Only available starting with Hive 0.7.0.)
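
To make the complex types more concrete, here is a rough Python sketch of how one text row containing an ARRAY, a MAP and a STRUCT could be split into Python values. Everything specific in it is an assumption: the table layout (name STRING, hobbies ARRAY<STRING>, scores MAP<STRING, INT>, address STRUCT<city: STRING, zip: STRING>) is hypothetical, and the separators are taken to be the control characters Hive uses by default for delimited text (Ctrl-A between fields, Ctrl-B between collection items, Ctrl-C between map keys and values). A table created with other delimiters would need different constants, and nested complex types are not handled.

```python
# Assumed default delimiters of Hive's delimited text format.
FIELD_SEP = "\x01"  # Ctrl-A, between columns
ITEM_SEP = "\x02"   # Ctrl-B, between array items, map entries, struct fields
KEY_SEP = "\x03"    # Ctrl-C, between a map key and its value


def parse_array(text):
    """ARRAY<STRING> -> list of str."""
    return text.split(ITEM_SEP) if text else []


def parse_map(text):
    """MAP<STRING, INT> -> dict of str to int."""
    result = {}
    for entry in parse_array(text):
        key, value = entry.split(KEY_SEP, 1)
        result[key] = int(value)
    return result


def parse_struct(text, field_names):
    """STRUCT<...> -> dict keyed by the struct's field names."""
    return dict(zip(field_names, text.split(ITEM_SEP)))


def parse_row(line):
    """Split one row of the hypothetical four-column table."""
    name, hobbies, scores, address = line.rstrip("\n").split(FIELD_SEP)
    return {
        "name": name,
        "hobbies": parse_array(hobbies),
        "scores": parse_map(scores),
        "address": parse_struct(address, ["city", "zip"]),
    }


# Build a sample row with the same delimiters and parse it back.
sample = FIELD_SEP.join([
    "alice",
    ITEM_SEP.join(["chess", "golf"]),
    ITEM_SEP.join(["math" + KEY_SEP + "90", "art" + KEY_SEP + "85"]),
    ITEM_SEP.join(["Paris", "75001"]),
])
print(parse_row(sample))
```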


II. Primitive Types

· Types are associated with the columns in the tables. The following primitive types are supported:

· Integers
    · TINYINT: 1-byte integer
    · SMALLINT: 2-byte integer
    · INT: 4-byte integer
    · BIGINT: 8-byte integer


· Boolean type
    · BOOLEAN: TRUE/FALSE


· Floating point numbers
    · FLOAT: single precision
    · DOUBLE: double precision


· String type
    · STRING: sequence of characters in a specified character set


https://cwiki.apache.org/confluence/display/Hive/Tutorial


III. ETL Workflow with a Python Script

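1) Create a table and load the raw data into it (table, load): E, extract

2) Run a SELECT that passes the rows through a Python script (select, python): T, transform

3) Write the transformed rows into a sub table (sub table): L, load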
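
The T step above can be sketched as a small Python script that reads tab-separated rows and writes tab-separated rows back. The column layout (userid, movieid, rating, unixtime) and the weekday conversion are assumptions modeled on the example in the Hive tutorial linked earlier, not something this article specifies. The function names are hypothetical, and timestamps are treated as UTC so the result does not depend on the local time zone.

```python
import io
from datetime import datetime, timezone


def transform_line(line):
    # Input:  userid <TAB> movieid <TAB> rating <TAB> unixtime
    # Output: userid <TAB> movieid <TAB> rating <TAB> weekday (1 = Monday)
    userid, movieid, rating, unixtime = line.rstrip("\n").split("\t")
    # Assumption: interpret the timestamp as UTC.
    moment = datetime.fromtimestamp(float(unixtime), tz=timezone.utc)
    return "\t".join([userid, movieid, rating, str(moment.isoweekday())])


def run(source, sink):
    """Transform every non-empty line from source and write it to sink."""
    for line in source:
        if line.strip():
            sink.write(transform_line(line) + "\n")


# Demonstration with in-memory streams. In a script called by Hive,
# the call would be run(sys.stdin, sys.stdout) instead.
demo_in = io.StringIO("196\t242\t3\t881250949\n186\t302\t3\t259200\n")
demo_out = io.StringIO()
run(demo_in, demo_out)
print(demo_out.getvalue(), end="")
```

On the Hive side, a script like this is typically registered with ADD FILE and called from a SELECT TRANSFORM (...) USING 'python script_name.py' AS (...) query whose output is inserted into the sub table. The script name here is a placeholder.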

Original article: https://www.cnblogs.com/weiyiming007/p/10750144.html