Hive

Virtual Columns

Hive 0.8.0 provides support for two virtual columns:

One is INPUT__FILE__NAME, which is the input file's name for a mapper task.

The other is BLOCK__OFFSET__INSIDE__FILE, which is the current global file position.

For block-compressed files, it is the file offset of the current block, that is, the offset of the block's first byte.

Simple Examples

select INPUT__FILE__NAME, key, BLOCK__OFFSET__INSIDE__FILE from src;

select key, count(INPUT__FILE__NAME) from src group by key order by key;

select * from src where BLOCK__OFFSET__INSIDE__FILE > 12000 order by key;
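The virtual columns can also be used as grouping keys, which is a handy way to see how a table's rows are spread across its underlying files. The following is only a sketch, assuming the same src table as in the examples above; the alias row_count is an illustrative name and not part of Hive.

```sql
-- Count the rows that each input file contributes to the table.
-- Assumes the `src` table from the examples above; `row_count` is a made-up alias.
select INPUT__FILE__NAME, count(*) as row_count
from src
group by INPUT__FILE__NAME;
```

Each output row pairs one file path with the number of records read from it, which can help spot unexpectedly small or empty files.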

Original source: https://www.cnblogs.com/tmeily/p/4249954.html