Pig filter用法举例

filter:过滤数据,只有符合特定条件的数据才会被保留下来,然后进入下一个数据流。

 
1)等值比较
filter data by $0 == 1
filter data by $0 != 1
 
2)字符串 正则匹配  JAVA的正则表达式
字符串以CM开头
filter data by $0 matches 'CM.*';

字符串包含CM

filter data by $0 matches '.*CM.*';

 

3)not
filter data by not $0==1;
filter data by not $0 matches '.*CM.*';

   

4)NULL处理
filter data by $0 is not null;

   

5)UDF
filter data by isValidate($0);

   

6)and or
filter data by $0!=1 and $1>10

  

原文地址:https://www.cnblogs.com/lishouguang/p/4559300.html