Kudu + Spark + Impala + Hadoop

1. Spark 2.2 requires Kudu 1.5.0 or later

Spark Integration Known Issues and Limitations

Spark 2.2+ requires Java 8 at runtime, even though the Kudu Spark 2.x integration is Java 7 compatible. Spark 2.2 is the default dependency version as of Kudu 1.5.0.

Kudu tables with a name containing upper case or non-ascii characters must be assigned an alternate name when registered as a temporary table.

Kudu tables with a column name containing upper case or non-ascii characters may not be used with SparkSQL. Columns may be renamed in Kudu to work around this issue.
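The naming restriction on tables and columns can be checked before registering anything with Spark. The sketch below is a minimal, standalone illustration: `needs_alternate_name` and `alternate_name` are hypothetical helpers, not part of Kudu or Spark, and the underscore-replacement scheme is only one possible way to derive an alias.

```python
import re


def needs_alternate_name(name: str) -> bool:
    """Return True if the name contains an upper-case or non-ASCII character."""
    return any(ch.isupper() or ord(ch) > 127 for ch in name)


def alternate_name(name: str) -> str:
    """Derive a lower-case, ASCII-only alias (hypothetical naming scheme)."""
    lowered = name.lower()
    # Replace every character outside [a-z0-9_] with an underscore.
    return re.sub(r"[^a-z0-9_]", "_", lowered)


if __name__ == "__main__":
    for table in ["UserEvents", "metrics_2017"]:
        if needs_alternate_name(table):
            print(table, "-> register as", alternate_name(table))
        else:
            print(table, "-> can be registered under its own name")
```

For a table, the alias would be the name passed when registering the temporary table. For a column, the rename has to happen in Kudu itself, as the note above says.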

<> and OR predicates are not pushed to Kudu, and instead will be evaluated by the Spark task. Only LIKE predicates with a suffix wildcard are pushed to Kudu, meaning that LIKE "FOO%" is pushed down but LIKE "FOO%BAR" isn’t.
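The LIKE rule can be modeled with a small check. This is a rough sketch of the rule as stated here, not Kudu's or Spark's actual logic: `like_is_pushed_down` is a hypothetical helper, it treats `_` as a wildcard too, and it ignores escaped wildcards and any rewriting Spark may apply to wildcard-free patterns.

```python
def like_is_pushed_down(pattern: str) -> bool:
    """Return True if the pattern is a literal prefix plus one trailing '%'."""
    if not pattern.endswith("%"):
        return False
    prefix = pattern[:-1]
    # Any other wildcard inside the prefix prevents pushdown in this model.
    return prefix != "" and "%" not in prefix and "_" not in prefix


if __name__ == "__main__":
    for pattern in ["FOO%", "FOO%BAR", "%FOO"]:
        where = "Kudu" if like_is_pushed_down(pattern) else "the Spark task"
        print("LIKE", repr(pattern), "is evaluated by", where)
```

Predicates that fall on the Spark side still return correct results. They only cost more, because the rows are shipped to Spark before being filtered.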

Kudu does not support every type supported by Spark SQL. For example, Date and complex types are not supported.
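A quick pre-check of a schema can flag the columns affected by this limitation. The sketch below assumes the schema is given as (column name, Spark SQL type string) pairs, with type strings such as `date` or `array<string>`. `unsupported_columns` is a hypothetical helper, and the list of prefixes covers only the types named above for the Kudu version these notes describe.

```python
# Type-name prefixes for Date and the complex types (array, map, struct).
UNSUPPORTED_PREFIXES = ("date", "array", "map", "struct")


def unsupported_columns(schema):
    """Return names of columns whose type is Date or a complex type."""
    return [
        name
        for name, type_name in schema
        if type_name.strip().lower().startswith(UNSUPPORTED_PREFIXES)
    ]


if __name__ == "__main__":
    # Hypothetical example schema.
    schema = [("id", "bigint"), ("day", "date"), ("tags", "array<string>")]
    print(unsupported_columns(schema))
```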

Kudu tables may only be registered as temporary tables in SparkSQL. Kudu tables may not be queried using HiveContext.

2. Impala: compatible with Apache Impala 2.8.0

Requirements
This documentation is specific to certain versions of Impala. The syntax described will work only in the following releases:

The version of Impala 2.7.0 that ships with CDH 5.10. SELECT VERSION() will report impalad version 2.7.0-cdh5.10.0.

Apache Impala 2.8.0 releases compiled from source. SELECT VERSION() will report impalad version 2.8.0.

Older versions of Impala 2.7 (including the special IMPALA_KUDU releases previously available) have incompatible syntax. Future versions are likely to be compatible with this syntax, but we recommend checking that this is the latest available documentation for the version you have installed.
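The version requirement can be checked against the string returned by SELECT VERSION(). The sketch below assumes the output contains text of the form `impalad version X.Y.Z`, as in the two examples above. `impala_version` and `syntax_is_documented` are hypothetical helpers that recognize only the two releases named here and treat every other version as unverified.

```python
import re


def impala_version(version_output: str):
    """Extract (major, minor, patch) from the SELECT VERSION() output."""
    match = re.search(r"impalad version (\d+)\.(\d+)\.(\d+)", version_output)
    if match is None:
        raise ValueError("unrecognized version string: %r" % version_output)
    return tuple(int(part) for part in match.groups())


def syntax_is_documented(version_output: str) -> bool:
    """Return True for Impala 2.7.0 from CDH 5.10 and for Apache Impala 2.8.0."""
    version = impala_version(version_output)
    if version == (2, 7, 0):
        # Only the 2.7.0 build shipped with CDH 5.10 is covered.
        return "cdh5.10" in version_output
    return version == (2, 8, 0)


if __name__ == "__main__":
    for output in ["impalad version 2.7.0-cdh5.10.0", "impalad version 2.8.0"]:
        print(output, "->", syntax_is_documented(output))
```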