Window Operations Explained


window(windowLength, slideInterval): returns a windowed DStream whose window spans windowLength and slides forward every slideInterval. Both values must be multiples of the batch interval.
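To see which batches end up in each window, here is a minimal sketch in plain Scala collections. It is a model, not the Spark API: the object name WindowModel, the sample data, and the 10-second batch interval are assumptions made for illustration, and window sizes are expressed in batches rather than as Durations.

```scala
object WindowModel {
  // Hypothetical data: each inner list is one batch; the batch interval is assumed to be 10 s.
  val batches: List[List[String]] = List(
    List("a", "b"), List("a"), List("c"), List("b", "b"), List("a")
  )

  // windowLen and slide are counted in batches:
  // a 30 s window with a 20 s slide corresponds to windowLen = 3, slide = 2.
  def windows[T](bs: List[List[T]], windowLen: Int, slide: Int): List[List[T]] =
    bs.sliding(windowLen, slide).map(_.flatten).toList

  def main(args: Array[String]): Unit =
    windows(batches, 3, 2).foreach(println)
}
```

Because the slide (2 batches) is shorter than the window (3 batches), the third batch appears in both windows.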

countByWindow(windowLength, slideInterval): returns the number of elements in the window

reduceByWindow(func, windowLength, slideInterval): reduces the elements in the window with func, which should be associative and commutative

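// x and y are elements of the window. Returning x just keeps one of them and only
// shows which arguments func receives; a real func should combine x and y.
val ds1 = wordCounts.reduceByWindow((x, y) => {
    println(x)
    println(y)
    x
}, Seconds(30), Seconds(20))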
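The difference between countByWindow and reduceByWindow can be sketched with plain Scala collections. This is only a model under assumed data: the object CountReduceModel and its helper methods are hypothetical and merely mirror the names of the Spark operations.

```scala
object CountReduceModel {
  // Hypothetical data: one inner list per batch.
  val batches: List[List[Int]] = List(List(1, 2), List(3), List(4, 5), List(6), List(7))

  // Window of windowLen batches, sliding by slide batches.
  def windows[T](bs: List[List[T]], windowLen: Int, slide: Int): List[List[T]] =
    bs.sliding(windowLen, slide).map(_.flatten).toList

  // Models countByWindow: one element count per window.
  def countByWindow[T](ws: List[List[T]]): List[Long] = ws.map(_.size.toLong)

  // Models reduceByWindow: one reduced value per (non-empty) window.
  def reduceByWindow[T](ws: List[List[T]])(func: (T, T) => T): List[T] =
    ws.map(_.reduce(func))

  def main(args: Array[String]): Unit = {
    val ws = windows(batches, 3, 2)
    println(countByWindow(ws))          // element count of each window
    println(reduceByWindow(ws)(_ + _))  // sum of each window
  }
}
```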

reduceByKeyAndWindow(func, windowLength, slideInterval, [numTasks]): applies reduceByKey to the data in the window

// x and y are values that share the same key
wordCounts.reduceByKeyAndWindow((x: Int, y: Int) => x + y, Seconds(30), Seconds(20))

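reduceByKeyAndWindow(func, invFunc, windowLength, slideInterval, [numTasks]): an incremental version of the operation above. invFunc is the inverse of func. If its parameters are x and y, then x is the result of applying func to the previous window, and y is the result of applying func to the elements of the previous window that have slid out of the current window (the part of the two windows that does not overlap in time). Checkpointing must be enabled to use this operation.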

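// The invFunc variant needs a checkpoint directory
sc.setCheckpointDir("D://checkpoints/")
// m: result of applying func (the first argument) to the same-key elements of the previous window
// n: result of applying func to the same-key elements that are in the previous window but not in the current one
val ds2 = wordCounts.reduceByKeyAndWindow((x: Int, y: Int) => x + y, (m: Int, n: Int) => m - n, Seconds(10), Seconds(10))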

  

// Produces the same result as reduceByKeyAndWindow(func, windowLength, slideInterval, [numTasks])
wordCounts.reduceByKeyAndWindow((x: Int, y: Int) => x + y, (x: Int, y: Int) => x - y, Seconds(10), Seconds(10))
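Why the invFunc variant gives the same result as a full recomputation can be sketched in plain Scala. This is a simplified model, not Spark's implementation: the object IncrementalWindowModel, the methods full and incremental, and the per-batch counts are all hypothetical. The window is assumed to be 3 batches long and to slide by 1 batch.

```scala
object IncrementalWindowModel {
  type Counts = Map[String, Int]

  // Hypothetical per-batch word counts, one map per batch interval.
  val batches: Vector[Counts] = Vector(
    Map("a" -> 2, "b" -> 1),
    Map("a" -> 1),
    Map("b" -> 3, "c" -> 1),
    Map("a" -> 4),
    Map("c" -> 2)
  )

  // Full recomputation: reduce all values of each key in the window with func.
  def full(window: Seq[Counts], func: (Int, Int) => Int): Counts =
    window.flatten.groupBy(_._1).map { case (k, kvs) => k -> kvs.map(_._2).reduce(func) }

  // Incremental update: start from the previous window's result, undo the batches
  // that left the window with invFunc, then add the batches that entered with func.
  def incremental(prev: Counts, left: Seq[Counts], entered: Seq[Counts],
                  func: (Int, Int) => Int, invFunc: (Int, Int) => Int): Counts = {
    val afterRemoval = full(left, func).foldLeft(prev) {
      case (acc, (k, old)) => acc.updated(k, invFunc(acc(k), old))
    }
    full(entered, func).foldLeft(afterRemoval) {
      case (acc, (k, v)) => acc.updated(k, acc.get(k).map(func(_, v)).getOrElse(v))
    }
  }

  def main(args: Array[String]): Unit = {
    val w1 = full(batches.slice(0, 3), _ + _)  // batches 0..2
    // Next window covers batches 1..3: batch 0 left, batch 3 entered.
    val w2 = incremental(w1, batches.slice(0, 1), batches.slice(3, 4), _ + _, _ - _)
    println(w2 == full(batches.slice(1, 4), _ + _))
  }
}
```

In this model a key whose values have all left the window stays in the map with value 0. Spark's real method accepts an optional filterFunc parameter that can be used to drop such entries.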

  

countByValueAndWindow(windowLength, slideInterval, [numTasks]): counts how many times each distinct element occurs in the window, returned as (element, count) pairs
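As a rough illustration of what countByValueAndWindow computes for each window, here is a plain-Scala model. The object CountByValueModel, the method countByValue, and the two sample windows are assumptions made for this sketch and are not part of Spark.

```scala
object CountByValueModel {
  // Hypothetical contents of two consecutive windows.
  val windows: List[List[String]] = List(
    List("a", "b", "a", "c"),
    List("c", "b", "b", "a")
  )

  // Count the occurrences of each distinct element in one window.
  def countByValue[T](window: List[T]): Map[T, Long] =
    window.groupBy(identity).map { case (v, occurrences) => v -> occurrences.size.toLong }

  def main(args: Array[String]): Unit =
    windows.map(countByValue).foreach(println)
}
```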

Original article: https://www.cnblogs.com/heml/p/6781740.html