Programming Hive: Reading Notes

1、Schema on read

When you write data to a traditional database, either by loading external data, writing the output of a query, running UPDATE statements, etc., the database has total control over the storage. The database is the "gatekeeper". An important implication of this control is that the database can enforce the schema as data is written. This is called schema on write.

Hive has no such control over the underlying storage. There are many ways to create,
modify, and even damage the data that Hive will query. Therefore, Hive can only apply
the schema when a query reads the data. This is called schema on read.
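
A minimal sketch of what schema on read looks like in practice. The table name, columns, and HDFS path are hypothetical, chosen only for illustration. The files under the path may have been written by any tool, and Hive does not check them when the table is declared.

```sql
-- Hypothetical external table over files that already exist in HDFS.
-- Declaring it only records metadata; the files are not validated.
CREATE EXTERNAL TABLE IF NOT EXISTS raw_events (
  event_id   INT,
  event_name STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/data/raw_events';  -- assumed path

-- The schema is applied only now, while the rows are being read.
SELECT event_id, event_name
FROM raw_events;
```

With the default text format, a field that cannot be parsed as the declared type (for example, a non-numeric event_id) typically comes back as NULL rather than causing the write or the query to be rejected.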

2、HiveQL

HiveQL is the Hive query language. Like all SQL dialects in widespread use, it doesn’t
fully conform to any particular revision of the ANSI SQL standard. It is perhaps closest
to MySQL’s dialect, but with significant differences. In the Hive versions the book
covers, Hive offers no support for row-level inserts, updates, and deletes, and it doesn’t
support transactions. Hive adds extensions to provide better performance in the context
of Hadoop and to integrate with custom extensions and even external programs.
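
Because there are no row-level inserts, data usually enters a table in bulk. The sketch below shows two common ways; the table names (employees, staged_employees), the columns, and the path are assumptions made up for this example.

```sql
-- Hypothetical tables: staged_employees (source) and employees (target).
-- Replace the entire contents of the target with the result of a query.
INSERT OVERWRITE TABLE employees
SELECT name, salary
FROM staged_employees
WHERE salary IS NOT NULL;

-- Or move already-prepared files into the table's storage directory.
-- The HDFS path is an assumption.
LOAD DATA INPATH '/tmp/employees_batch'
INTO TABLE employees;
```

Both statements work on whole sets of rows or whole files, which fits the batch-oriented model described above.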

Create a database: create database if not exists testdatabase; # same syntax as MySQL

Copy a table: create table A like B; # creates table A with the same schema as B (the data is not copied); the same syntax also works in MySQL.
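
The two statements above can be tried together as follows. Table B and its columns are hypothetical; the final statement is only needed when the rows should be copied as well.

```sql
CREATE DATABASE IF NOT EXISTS testdatabase;
USE testdatabase;

-- Hypothetical source table B.
CREATE TABLE IF NOT EXISTS B (id INT, name STRING);

-- A gets B's column definitions and storage format, but none of its rows.
CREATE TABLE IF NOT EXISTS A LIKE B;

-- Copy the data explicitly if it is wanted too.
INSERT OVERWRITE TABLE A
SELECT * FROM B;
```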

With SHOW TABLES, using the IN database_name clause and a regular expression for the
table names together is not supported in the Hive version the book covers.
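
For reference, a short sketch of the SHOW TABLES forms this restriction is about. The database name mydb and the pattern are illustrative.

```sql
SHOW TABLES;            -- tables in the current database
SHOW TABLES IN mydb;    -- tables in another database
SHOW TABLES 'empl.*';   -- tables in the current database whose names match the pattern
```

Each form works on its own; the note above is about combining the second and third in a single statement.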

DESCRIBE EXTENDED mydb.employees; # shows detailed information about the table; MySQL does not support EXTENDED here. Replacing EXTENDED with FORMATTED provides more readable but also more verbose output.