Sampling Queries in MongoDB with Aggregation (by Record Count and by Time)

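  I was not familiar with MongoDB before, but our project needed it. The data volume is fairly large and my manager wanted sampling queries to control the amount of data returned, so I looked into it myself. I have tested the approach and it works, so I am sharing it here.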

  Without further ado, here is the code:

  Option 1: add an auto-increment primary key to sample by record count

  1. When storing records, do not use the default id; use an auto-increment id instead. The implementation is as follows:

  

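import org.springframework.data.annotation.Id;

/**
 * Sequence document; maps the counter value stored for each collection.
 * @author Administrator
 */
public class MongoSequence {
    @Id
    private String id;
    // Name of the collection this counter belongs to (the key that getNextSequence queries on)
    private String collName;
    // Last sequence value handed out
    private int seq;

    public int getSeq() {
        return seq;
    }
}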

  

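import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.data.mongodb.core.FindAndModifyOptions;
import org.springframework.data.mongodb.core.MongoTemplate;
import org.springframework.data.mongodb.core.query.Criteria;
import org.springframework.data.mongodb.core.query.Query;
import org.springframework.data.mongodb.core.query.Update;
import org.springframework.stereotype.Component;

/**
 * Utility class for obtaining the next sequence number.
 * @author Administrator
 */
@Component
public class MongoAutoidUtil {

    @Autowired
    MongoTemplate mongoTemplate;

    public int getNextSequence(String collectionName) {
        // One counter document per collection, keyed by collName
        Query query = new Query(Criteria.where("collName").is(collectionName));
        // Increment seq by 1 as part of the same findAndModify call
        Update update = new Update().inc("seq", 1);
        // upsert creates the counter on first use; returnNew returns the value after the increment
        FindAndModifyOptions options = new FindAndModifyOptions().upsert(true).returnNew(true);
        MongoSequence seqId = mongoTemplate.findAndModify(query, update, options, MongoSequence.class);
        return seqId.getSeq();
    }

}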
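// Insert a record
public void insert(DeviceData110 de) {
    // Use the next sequence value as the id; the record is saved to the collection named after its paramName
    de.setId(mongoAutoidUtil.getNextSequence(de.getParamName()));
    mongoTemplate.save(de, de.getParamName());
}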
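
  To make the counter's behaviour easier to see without a database, here is a minimal in-memory sketch of the same contract: the first call for a name creates the counter, and every call returns the value after incrementing. The class InMemorySequence is a hypothetical stand-in written for illustration only; it is not part of the original code and does not talk to MongoDB.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical in-memory stand-in for the per-collection counter documents. */
final class InMemorySequence {

    // One counter per collection name, like one counter document per collName
    private final ConcurrentMap<String, AtomicInteger> counters = new ConcurrentHashMap<>();

    /**
     * Mirrors the findAndModify call with inc + upsert + returnNew:
     * create the counter at 0 if it is missing, then return the incremented value.
     */
    public int getNextSequence(String collectionName) {
        return counters
                .computeIfAbsent(collectionName, name -> new AtomicInteger())
                .incrementAndGet();
    }

    public static void main(String[] args) {
        InMemorySequence sequence = new InMemorySequence();
        System.out.println(sequence.getNextSequence("aaa"));
        System.out.println(sequence.getNextSequence("aaa"));
        System.out.println(sequence.getNextSequence("bbb"));
    }
}
```

  Each collection name gets its own independent sequence, which is why the ids in one collection increase by 1 per inserted record and can later be sampled with a modulo test.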

  2. Query the data. The implementation is as follows:

    @Autowired
    private MongoTemplate mongoTemplate;

    public List<DeviceData110> find() {
        // Filter first so that the projection only handles matching documents
        MatchOperation match1 = Aggregation.match(new Criteria("paramName").is("aaa"));
        // Time window; DateUtil is a helper class that is not shown in this post
        MatchOperation match2 = Aggregation.match(new Criteria("retrieveTime").gte(DateUtil.getAssignTime(new Date(), -1)).lte(DateUtil.getAssignTime(new Date(), 1)));
        // Sampling step: keep records whose _id leaves remainder 0 when divided by 2
        MatchOperation match3 = Aggregation.match(new Criteria("_id").mod(2, 0));
        ProjectionOperation dateProjection = Aggregation.project("_id", "paramName", "retrieveTime");
        Aggregation agg = Aggregation.newAggregation(match1, match2, match3, dateProjection);
        AggregationResults<DeviceData110> results = mongoTemplate.aggregate(agg, "aaa", DeviceData110.class);
        return results.getMappedResults();
    }

  

// Entity class
public class DeviceData110 implements Serializable {

    private static final long serialVersionUID = -4763630558724084819L;
    public int id;
    public String paramName;
    public Date retrieveTime;

    // Getters and setters omitted
}

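  The demo returns all records whose paramName is "aaa", whose retrieveTime falls between one day ago and now, and whose id leaves a remainder of 0 when divided by 2. Changing the divisor gives sampling at different granularities: a divisor of n keeps roughly one record in every n.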
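
  The effect of the divisor can be checked with plain Java, without MongoDB. The sketch below applies the same remainder test to a list of consecutive ids; the class ModuloSampling and its sample method are hypothetical names used only for this illustration.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

/** Hypothetical illustration of the remainder test applied to auto-increment ids. */
final class ModuloSampling {

    /** Keeps the ids for which id % divisor == remainder, as the mod criterion on _id does. */
    public static List<Integer> sample(List<Integer> ids, int divisor, int remainder) {
        return ids.stream()
                .filter(id -> id % divisor == remainder)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Consecutive ids 1..20, as an auto-increment key would produce
        List<Integer> ids = IntStream.rangeClosed(1, 20).boxed().collect(Collectors.toList());
        System.out.println(sample(ids, 2, 0)); // every second record
        System.out.println(sample(ids, 5, 0)); // every fifth record
    }
}
```

  With consecutive ids, a divisor of 2 keeps half of the records and a divisor of 5 keeps a fifth. If records are deleted or ids have gaps, the ratio is only approximate.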

  Option 2: use date aggregation operators to sample by time

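  The prerequisite is that the queried documents contain a field retrieveTime of type ISO date. Project the minute of that field, add a MatchOperation on it as shown below, and pass both to Aggregation.newAggregation(); the query then returns only the records whose minute value is 0, 15, 30 or 45. Other date operators such as hour and second can be used in the same way.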

// $minute yields an integer, so the values to match must be numbers, not strings
ProjectionOperation project1 = Aggregation.project("_id").andExpression("minute(retrieveTime)").as("minute");
MatchOperation match = Aggregation.match(new Criteria("minute").in(0, 15, 30, 45));
Aggregation agg = Aggregation.newAggregation(project1, match);
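
  The filter that this pipeline applies can be reproduced in plain Java to see which timestamps survive. This is a sketch only: QuarterHourSampling is a hypothetical class name, and it evaluates the minute in UTC, which is what the date operators do when no time zone is given.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

/** Hypothetical illustration of sampling by minute-of-hour. */
final class QuarterHourSampling {

    // Minute values to keep, as integers
    private static final Set<Integer> QUARTER_HOURS = new HashSet<>(Arrays.asList(0, 15, 30, 45));

    /** Keeps the timestamps whose minute-of-hour (in UTC) is 0, 15, 30 or 45. */
    public static List<Instant> sample(List<Instant> times) {
        return times.stream()
                .filter(t -> QUARTER_HOURS.contains(t.atZone(ZoneOffset.UTC).getMinute()))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Instant> times = Arrays.asList(
                Instant.parse("2018-02-07T10:00:05Z"),
                Instant.parse("2018-02-07T10:07:00Z"),
                Instant.parse("2018-02-07T10:15:59Z"),
                Instant.parse("2018-02-07T10:45:00Z"));
        System.out.println(sample(times));
    }
}
```

  Note that every record written during a matching minute is kept, not just one per quarter hour, so the sample size depends on how many records arrive within those minutes.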

  

  

Original article: https://www.cnblogs.com/hhhshct/p/8426068.html