Pig foreach usage examples

foreach: iterates over the data one row at a time, processes each row, and returns a tuple for it.

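-- schema assumed here so that the fields can be referenced by name below
users = load '/users.data' as (name:chararray, age:int);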
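As a rough, self-contained sketch of the row-by-row behavior described above: each input row yields one output tuple whose fields are the expressions listed after generate. The file path, the schema, and the next_age alias are assumptions made for illustration only.

```pig
-- hypothetical input file and schema
users = load '/users.data' as (name:chararray, age:int);

-- one output tuple per input row; "as" names the generated field
f = foreach users generate name, age + 1 as next_age;

-- print the schema of f, then its contents
describe f;
dump f;
```

describe is a convenient way to check which names and types the generated fields ended up with before running the full job.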
 
1) Referencing fields by alias
f = foreach users generate name, age;

   

2) Referencing fields by position
f = foreach users generate $0, $1;

   

3) Field ranges with ..
From the name field through the last field
f = foreach users generate name ..;

  

From the first field through the age field
f = foreach users generate ..age;

  

From the name field through the age field
f = foreach users generate name..age;
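Field ranges are easier to see on a relation with more than two columns. The sketch below assumes a hypothetical four-column schema (id, name, age, city); the comments state which fields each range should expand to under that assumption.

```pig
-- hypothetical file and schema, for illustration only
people = load '/people.data' as (id:int, name:chararray, age:int, city:chararray);

-- name through the last field: name, age, city
r1 = foreach people generate name ..;

-- first field through age: id, name, age
r2 = foreach people generate .. age;

-- name through age: name, age
r3 = foreach people generate name .. age;
```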

   

4) * stands for all fields
f = foreach users generate *;

   

5) Arithmetic operators: + - * / %
f = foreach data generate $1 - $0;
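A slightly fuller sketch of the arithmetic operators, assuming a hypothetical relation with two int columns named a and b. With both operands typed as int, the division is expected to be integer division, so one operand is cast to double where a fractional result is wanted.

```pig
-- hypothetical schema: two integer columns
data = load 'data' as (a:int, b:int);

f = foreach data generate
        b - a         as diff,
        b * a         as product,
        b % a         as remainder,
        (double)b / a as ratio;   -- cast so the result is not truncated to an int
```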

   

6) Ternary operator ? :
f = foreach users generate name, (age > 18 ? 1 : 0);
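Ternary expressions can be nested, which is one way to deal with null values explicitly. A minimal sketch, assuming the users relation has name and age fields; the category alias and the label strings are made up for this example.

```pig
-- assumed schema
users = load '/users.data' as (name:chararray, age:int);

-- each ternary expression is wrapped in its own parentheses
f = foreach users generate
        name,
        (age is null ? 'unknown' : (age > 18 ? 'adult' : 'minor')) as category;
```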

   

7) Referencing a map value
data = load 'data' as (name:chararray, team:chararray, bat:map[]);
f = foreach data generate bat#'key1';
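With an untyped map (map[]), the looked-up value comes back as a bytearray, and a key that is not present yields null. If a specific type is needed, the value can be cast. A sketch, where the key name 'key1' and the choice of int are assumptions:

```pig
data = load 'data' as (name:chararray, team:chararray, bat:map[]);

-- look up 'key1' in the map and cast the value to int
f = foreach data generate name, (int)bat#'key1' as key1_value;
```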

   

8) Referencing a tuple field
data = load 'data' as (name:chararray, team:chararray, bat:tuple(x:int, y:int));
f = foreach data generate bat.x;

   

9) Referencing fields of a bag
data = load 'data' as (name:chararray, team:chararray, bat:bag{t:tuple(x:int, y:int)});
f = foreach data generate bat.x;

Referencing multiple fields

f = foreach data generate bat.(x, y);
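Note that projecting a field out of a bag gives back a bag (of one-field tuples), not a single value. If one output row per element is wanted instead, flatten can be applied to the projection. A minimal sketch using the same schema as above:

```pig
data = load 'data' as (name:chararray, team:chararray, bat:bag{t:tuple(x:int, y:int)});

-- bat.x is a bag containing only the x values; flatten turns it into one row per x
f = foreach data generate name, flatten(bat.x) as x;
```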

   

10) Calling functions (UDFs)
upped = foreach data generate UPPER($0) as a, $1 as b;
grped = group upped by a;
sums = foreach grped generate group, SUM(upped.b);
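The same pipeline can be written with an explicit schema, which makes the types passed to UPPER and SUM clear. This is only a sketch: the field names name and score and the output aliases are assumptions, not part of the original data.

```pig
-- hypothetical input: one name and one numeric score per row
data = load 'data' as (name:chararray, score:int);

upped = foreach data generate UPPER(name) as a, score as b;
grped = group upped by a;

-- inside grped, the bag holding each group's rows is named after the grouped relation (upped)
sums = foreach grped generate group as a, SUM(upped.b) as total;
```

After group, each row has two fields: group (the key) and a bag named after the input relation, which is why the aggregate is written as SUM(upped.b) rather than SUM(b).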

  

 
 
 
 
Original source: https://www.cnblogs.com/lishouguang/p/4559295.html