Esper Series (3): Context and Group by

Context

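A context sorts events into boxes according to a rule defined in the partition by clause. There can be any number of boxes, and the boxes do not affect one another.

Function:

Combines event queries with partitioning. Context types include Hash Context, Category Context, and Non-Overlapping Context; the partition by form used below creates a keyed segmented context.

Syntax: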

    create context context_name partition [by]
      event_property [and event_property [and ...]] from stream_def
      [, event_property [...] from stream_def]
      [, ...]
 

Example:

    // Create the context (EPL statement)
    create context ctEvent partition by name from orderEvent;

    // Sum the salary values of the events in each context partition over the last two seconds
    String epsql = "context ctEvent select sum(salary) as result from orderEvent.win:time(2 sec)";
 

    // Each time the number of events in a context partition is non-zero and a multiple of 2, sum the salary of that partition's two most recent events; the 2 comes from win:length_batch(2)
    String epsql = "context ctEvent select sum(salary) as result from orderEvent.win:length_batch(2)";
 

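Notes:

win:length_batch(num): the listener fires only when the number of buffered events reaches num (that is, the event count is non-zero and a multiple of num); the most recent num events are then processed together as one batch.

win:length(num): a sliding window over the most recent num events. The listener fires on every arriving event, even before num events have accumulated.

Compared with win:length(num), win:length_batch(num) adds batching on top: nothing is processed until num events have arrived.

win:time(time): a sliding window over the events that arrived within the last time interval.

win:time_batch(time): like win:length_batch(num), except that batches are delimited by a time interval instead of an event count.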
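
To make the difference between the two length windows concrete, here is a minimal plain-Java simulation. It does not use the Esper API: the class WindowSketch and its method names are hypothetical, and it only mimics, under the semantics described in the notes above, when a listener computing sum(salary) would fire and what value it would see.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Plain-Java simulation of win:length vs. win:length_batch (not Esper code). */
final class WindowSketch {

    /** Mimics win:length(n): one output per event, summing the last n values. */
    static List<Integer> lengthWindowSums(int n, int[] salaries) {
        Deque<Integer> window = new ArrayDeque<>();
        List<Integer> outputs = new ArrayList<>();
        int sum = 0;
        for (int salary : salaries) {
            window.addLast(salary);
            sum += salary;
            if (window.size() > n) {
                sum -= window.removeFirst(); // oldest event leaves the window
            }
            outputs.add(sum); // fires even before n events have arrived
        }
        return outputs;
    }

    /** Mimics win:length_batch(n): one output per complete batch of n values. */
    static List<Integer> lengthBatchSums(int n, int[] salaries) {
        List<Integer> outputs = new ArrayList<>();
        int sum = 0;
        int count = 0;
        for (int salary : salaries) {
            sum += salary;
            count++;
            if (count == n) { // batch is full: fire, then start a new batch
                outputs.add(sum);
                sum = 0;
                count = 0;
            }
        }
        return outputs;
    }

    public static void main(String[] args) {
        int[] salaries = {100, 200, 300, 400, 500};
        System.out.println(lengthWindowSums(2, salaries));
        System.out.println(lengthBatchSums(2, salaries));
    }
}
```

With the salaries 100, 200, 300, 400, 500 and num = 2, the sliding version reports a sum after every event (100, 300, 500, 700, 900), while the batch version reports only 300 and 700; the fifth event stays buffered until a sixth arrives.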

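Effect:

Events in the stream are classified by the values of the properties named in the context definition, and an EPL statement that runs under the context performs its computation separately for each class.

When a context covers multiple event streams, every stream must supply the same number of context properties, and the corresponding properties must have the same data types.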
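
The partitioning effect can be sketched in plain Java as well. The following is only an illustration of the idea behind "context ctEvent ... win:length_batch(2)", not Esper code: PartitionSketch, OrderEvent and partitionedBatchSums are hypothetical names, and the per-name map stands in for the context partitions that the engine would manage itself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java illustration of per-partition batching (not Esper code). */
final class PartitionSketch {

    /** Hypothetical stand-in for the article's orderEvent type. */
    static final class OrderEvent {
        final String name;
        final int salary;

        OrderEvent(String name, int salary) {
            this.name = name;
            this.salary = salary;
        }
    }

    /**
     * Keeps one independent batch per name (the "partition by name" part) and
     * emits "name=sum" whenever a partition has collected batchSize events.
     */
    static List<String> partitionedBatchSums(int batchSize, List<OrderEvent> events) {
        Map<String, List<Integer>> pendingByName = new HashMap<>();
        List<String> outputs = new ArrayList<>();
        for (OrderEvent event : events) {
            List<Integer> batch =
                    pendingByName.computeIfAbsent(event.name, k -> new ArrayList<>());
            batch.add(event.salary);
            if (batch.size() == batchSize) {
                int sum = 0;
                for (int salary : batch) {
                    sum += salary;
                }
                outputs.add(event.name + "=" + sum);
                batch.clear(); // other partitions are left untouched
            }
        }
        return outputs;
    }

    public static void main(String[] args) {
        List<OrderEvent> events = Arrays.asList(
                new OrderEvent("alice", 100),
                new OrderEvent("bob", 10),
                new OrderEvent("alice", 200),
                new OrderEvent("bob", 20),
                new OrderEvent("alice", 300));
        System.out.println(partitionedBatchSums(2, events));
    }
}
```

Although the events of alice and bob are interleaved, each name fills its own batch: the result is alice=300 followed by bob=30, and the third alice event waits for a partner in its own partition.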

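Context property fields

Name: the name of the context.

ID: assigned automatically by the engine, starting at 0 and incrementing by one for each new partition; events with the same partition property values get the same ID.

Key1~keyN: the values of the event properties that the context partitions on, in declaration order.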

Group by

Syntax:

    group by aggregate_free_expression [, aggregate_free_expression] [, ...]
 

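Note: the group by expressions must not contain aggregate functions, nor may they be a property name that is wrapped in an aggregate function in the select clause.

Function:

group by operates on a value (a property is also a value); events that yield the same value fall into the same group.

With group by, the number of groups can grow very large. Grouping by a time unit, for example, keeps creating new groups, so memory consumption becomes a real problem. @Hint therefore provides two settings that limit how long a group's state lives so that the JVM can reclaim its memory in time: reclaim_group_aged and reclaim_group_freq.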

    // Group by name and average the salary of the events from the last 3 seconds. @Hint('reclaim_group_aged=1') reclaims any group that has received no update for 1 second; the value of reclaim_group_aged is in seconds.
    String epsql = "@Hint('reclaim_group_aged=1') select avg(salary) as result from orderEvent.win:time(3 sec) group by name";
 

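    // Group by name and average the salary of the events from the last 3 seconds. @Hint('reclaim_group_aged=2,reclaim_group_freq=1') reclaims groups that have received no update for 2 seconds and runs the reclaim sweep once per second (this guards against running out of memory under heavy load).
    String epsql = "@Hint('reclaim_group_aged=2,reclaim_group_freq=1') select avg(salary) as result from orderEvent.win:time(3 sec) group by name";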
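
How the two hint values interact is easier to see in a small simulation. The sketch below is plain Java, not Esper internals: ReclaimSketch and its members are hypothetical, time is passed in explicitly as whole seconds to keep the run deterministic, and the 3-second data window is left out so that only the reclaim logic is modelled.

```java
import java.util.HashMap;
import java.util.Map;

/** Plain-Java model of reclaim_group_aged / reclaim_group_freq (not Esper code). */
final class ReclaimSketch {

    /** Aggregation state kept per group. */
    private static final class GroupState {
        double sum;
        long count;
        long lastUpdateSec;
    }

    private final Map<String, GroupState> groups = new HashMap<>();
    private final long maxAgeSec;    // plays the role of reclaim_group_aged
    private final long sweepFreqSec; // plays the role of reclaim_group_freq
    private long lastSweepSec = 0;

    ReclaimSketch(long maxAgeSec, long sweepFreqSec) {
        this.maxAgeSec = maxAgeSec;
        this.sweepFreqSec = sweepFreqSec;
    }

    /** Adds one event and returns the running average of its group. */
    double add(String name, double salary, long nowSec) {
        sweepIfDue(nowSec);
        GroupState state = groups.computeIfAbsent(name, k -> new GroupState());
        state.sum += salary;
        state.count++;
        state.lastUpdateSec = nowSec;
        return state.sum / state.count;
    }

    /** Drops groups idle for maxAgeSec or longer, at most once per sweepFreqSec. */
    private void sweepIfDue(long nowSec) {
        if (nowSec - lastSweepSec < sweepFreqSec) {
            return; // not time for the next sweep yet
        }
        lastSweepSec = nowSec;
        groups.values().removeIf(state -> nowSec - state.lastUpdateSec >= maxAgeSec);
    }

    int liveGroups() {
        return groups.size();
    }

    public static void main(String[] args) {
        ReclaimSketch sketch = new ReclaimSketch(2, 1);
        sketch.add("alice", 100, 0);
        sketch.add("bob", 50, 1);
        System.out.println(sketch.add("bob", 70, 2)); // alice is swept here
        System.out.println(sketch.liveGroups());
    }
}
```

In this run the group for alice has been idle for 2 seconds when the sweep at second 2 happens, so its state is discarded, while bob's group survives and keeps its running average. A smaller frequency value means stale groups are noticed sooner at the cost of more frequent sweeps.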
 

Having

As in SQL, a where clause cannot contain aggregate functions; having exists to filter on aggregated values instead.
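
As an illustration, a statement such as "select name, avg(salary) as result from orderEvent.win:time(3 sec) group by name having avg(salary) > 1000" would keep only the groups whose average exceeds 1000 (the threshold and the statement are assumptions for this example, not taken from the article's code). The plain-Java sketch below mimics that evaluation order; HavingSketch and averagesHaving are hypothetical names and no Esper API is involved.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Plain-Java illustration of group by + having (not Esper code). */
final class HavingSketch {

    /**
     * Averages salaries per name, then keeps only the groups whose average is
     * greater than minAvg. The filter runs after aggregation, as having does.
     */
    static Map<String, Double> averagesHaving(String[] names, double[] salaries, double minAvg) {
        // name -> {sum, count}
        Map<String, double[]> accumulators = new LinkedHashMap<>();
        for (int i = 0; i < names.length; i++) {
            double[] acc = accumulators.computeIfAbsent(names[i], k -> new double[2]);
            acc[0] += salaries[i];
            acc[1] += 1;
        }
        Map<String, Double> result = new LinkedHashMap<>();
        for (Map.Entry<String, double[]> entry : accumulators.entrySet()) {
            double avg = entry.getValue()[0] / entry.getValue()[1];
            if (avg > minAvg) { // the "having avg(salary) > minAvg" step
                result.put(entry.getKey(), avg);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String[] names = {"alice", "bob", "alice", "bob"};
        double[] salaries = {900, 400, 1500, 600};
        System.out.println(averagesHaving(names, salaries, 1000));
    }
}
```

Here alice averages 1200 and is kept, while bob averages 500 and is dropped. A where clause with salary > 1000 would behave differently: it would discard alice's 900 event before aggregation, so her average would become 1500.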
