MongoDB

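MongoDB aggregation method: aggregate()

Syntax: db.collection_name.aggregate(AGGREGATE_OPERATION)

Pipeline: a MongoDB aggregation pipeline passes the documents produced by one stage on to the next stage as its input. The same stage type can appear more than once in a pipeline.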
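
To make the stage-to-stage hand-off concrete, here is a minimal sketch of a pipeline in which $match appears twice: once before grouping (filtering documents) and once after (filtering groups). The variable name yearlyTotalsPipeline and the thresholds are assumptions chosen for illustration; the collection and field names are the ones used elsewhere in these notes.

```javascript
// Each stage receives the output of the previous stage.
const yearlyTotalsPipeline = [
    { $match: { click_num: { $gt: 0 } } },                        // 1. keep documents with clicks
    { $group: { _id: "$year", total: { $sum: "$click_num" } } },  // 2. one document per year
    { $match: { total: { $gte: 1000 } } },                        // 3. same stage type again, now on groups
    { $sort: { total: -1 } }                                      // 4. largest total first
];

// `db` only exists inside the mongo shell, so the call is guarded.
if (typeof db !== "undefined") {
    db.test_study.aggregate(yearlyTotalsPipeline).forEach(doc => printjson(doc));
}
```

The second $match operates on the total field, which only exists after $group has run, so swapping stages 2 and 3 would filter out everything.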

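Common pipeline stages:

$group: groups documents by the specified expression

When _id is set to null (or to any constant such as 0), all documents fall into a single group and the accumulators are computed over the whole collection

// Sum click_num across all documents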

db.test_study.aggregate([
    {
        $group: {
            _id: null,
            total: {
                $sum: "$click_num"
            }
        }
    }
])

// Output

{
"_id": null,
"total": 2555
}

_id: set to a field path to group by that field and compute the sum for each group

// Group by year and sum click_num per group
db.test_study.aggregate([
    {
        $group: {
            _id: "$year",
            total: {
                $sum: "$click_num"
            }
        }
    }
])

// Output

// 1
{
    "_id": 2019,
    "total": 1310
}

// 2
{
    "_id": 2020,
    "total": 1400
}
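
As a rough illustration of what grouping by a field with $sum computes, the sketch below reproduces the same logic in plain JavaScript. The documents in sampleDocs and the function name groupSumByYear are hypothetical and are not the author's data; the server's own implementation is of course different.

```javascript
// Hypothetical sample documents (not taken from the test_study collection).
const sampleDocs = [
    { web_name: "a", click_num: 10, year: 2019 },
    { web_name: "b", click_num: 20, year: 2019 },
    { web_name: "c", click_num: 5,  year: 2020 }
];

// Plain-JavaScript equivalent of
// { $group: { _id: "$year", total: { $sum: "$click_num" } } }
function groupSumByYear(docs) {
    const totals = new Map();
    for (const doc of docs) {
        // Start each group at 0 and add the document's click_num.
        totals.set(doc.year, (totals.get(doc.year) || 0) + doc.click_num);
    }
    return Array.from(totals, ([year, total]) => ({ _id: year, total: total }));
}

const grouped = groupSumByYear(sampleDocs);
console.log(grouped);
// [ { _id: 2019, total: 30 }, { _id: 2020, total: 5 } ]
```

Unlike this sketch, $group does not guarantee the order of its output documents, so add a $sort stage when the order matters.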

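$project:

Controls the shape of the output documents: selects the fields to keep and can rename, add, or remove fields
Pipeline expressions can also be used for more complex operations such as arithmetic, date handling, and logic

The _id field is included in the output by default; to exclude it, set _id: 0 in $project

// Sample data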
{
    "_id": ObjectId("5e5e05ef88f52a1c16bbff0f"),
    "url": "https://www.baidu.com/",
    "web_name": "百度",
    "click_num": 605,
    "year": 2019
}

// The output contains web_name and click_num, plus the default _id field
db.test_study.aggregate([
    {
        $project: {
            web_name: "$web_name",
            click_num: "$click_num"
        }
    }
])


// With _id: 0 the output contains only web_name and click_num, without _id
db.test_study.aggregate([
    {
        $project: {
            _id: 0,
            web_name: "$web_name",
            click_num: "$click_num"
        }
    }
])
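
$project can also compute new fields with expressions. The following is a small sketch using the arithmetic operators $divide and $subtract; the output field names clicks_in_thousands and years_since_2000 are made up for illustration.

```javascript
const projectWithMath = [
    {
        $project: {
            _id: 0,
            web_name: 1,                                            // 1 keeps the field unchanged
            clicks_in_thousands: { $divide: ["$click_num", 1000] }, // computed field
            years_since_2000: { $subtract: ["$year", 2000] }        // computed field
        }
    }
];

// `db` only exists inside the mongo shell, so the call is guarded.
if (typeof db !== "undefined") {
    db.test_study.aggregate(projectWithMath).forEach(doc => printjson(doc));
}
```

Writing web_name: 1 is equivalent to web_name: "$web_name" when the field is kept under its original name.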

Outputting selected fields of a nested document

// Sample data
{
    "_id": 1,
    "title": "789",
    "author": {
        "last": "Li",
        "first": "Lucy",
        "country": "China"
    },
    "copies": 5,
    "lastModified": "2019-07-28"
}

Requirement: output title and author.country

db.test.aggregate([{
    $project: {
        _id: 0,
        title: "$title",
        author_country: "$author.country"
    }
}])

Exclude _id, author.country, and copies; output all other fields (same sample data as above)

db.test.aggregate([{
    $project: {
        "_id":0,
        "copies": 0,
        "author.country": 0
    }
}])

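Note:

A field name that contains a dot, such as "author.country", must be quoted; quoting a plain name such as "copies" is optional

Use $$REMOVE to exclude a field conditionally

// Sample data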
{
    "_id": 1,
    "title": "789",
    "author": {
        "last": "Li",
        "first": "Lucy",
        "country": "China"
    },
    "copies": 5,
    "lastModified": "2019-07-28"
}

{
    "_id": 2,
    "title": "790",
    "author": {
        "last": "G",
        "first": "Huai",
        "country": "Japan"
    },
    "copies": "",
    "lastModified": "2012-08-07"
}

// Exclude the copies field when copies is ""
db.test.aggregate([{
    $project: {
        copies: {
            $cond: {
                if: {
                    $eq: ["", "$copies"]
                },
                then: "$$REMOVE",
                else: "$copies"
            }
        }
    }
}])

// Output

// 1
{
    "_id": 1,
    "copies": 5
}

// 2
{
    "_id": 2
}

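$match: filters documents, passing only those that satisfy the condition to the next stage (it uses the same query syntax as db.test.find(), including multi-condition queries)

Sample data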

{
    "_id": ObjectId("5e5e05ef88f52a1c16bbff0f"),
    "url": "https://www.baidu.com/",
    "web_name": "百度",
    "click_num": 605,
    "year": 2019
}
{
    "_id": ObjectId("5e5e22a1b9534107347b2d53"),
    "url": "http://book.dangdang.com/2",
    "web_name": "博客园",
    "click_num": 600,
    "year": 2020
}

Query the web_name, click_num, and year fields of the documents whose web_name is "百度"

db.test_study.aggregate([
    {
        $match: {
            web_name: "百度"
        }
    },
    {
        $project: {
            _id: 0,
            web_name: "$web_name",
            click_num: "$click_num",
            year: "$year"
        }
    }
])

Output

{
    "web_name": "百度",
    "click_num": 605,
    "year": 2019
}
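
Because $match accepts the same filter document as find(), a multi-condition filter can be written once and used in both places. The sketch below assumes the test_study fields shown above; the variable names and the threshold of 600 are illustrative only.

```javascript
// Both conditions must hold (implicit AND): year is 2019 and click_num >= 600.
const clickFilter = {
    year: 2019,
    click_num: { $gte: 600 }
};

const matchPipeline = [
    { $match: clickFilter },
    { $project: { _id: 0, web_name: 1, click_num: 1, year: 1 } }
];

// `db` only exists inside the mongo shell, so the calls are guarded.
if (typeof db !== "undefined") {
    printjson(db.test_study.find(clickFilter).toArray());         // same filter with find()
    printjson(db.test_study.aggregate(matchPipeline).toArray());  // same filter with $match
}
```

Placing $match as early as possible in a pipeline generally reduces the number of documents the later stages have to process.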

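$limit: limits the number of documents passed to the next stage of the pipeline
```
// Only the first 2 documents go on to the next stage
{
    $limit: 2
}
```
$skip: skips the specified number of documents entering the stage and passes the rest to the next stage
```
// Skip the first 2 documents
{
    $skip: 2
}
```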

$sort: sorts the documents (1 for ascending, -1 for descending)

{
    $sort: {
        click_num: -1
    }
}
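
$sort, $skip, and $limit are often combined for pagination. The following is one possible sketch; buildPagePipeline is a hypothetical helper written for these notes, not part of MongoDB, and it assumes pages are numbered from 1.

```javascript
// Hypothetical helper: returns the stages for one page of results,
// ordered by click_num from highest to lowest.
function buildPagePipeline(page, pageSize) {
    return [
        { $sort: { click_num: -1, _id: 1 } },  // _id breaks ties so the order is stable
        { $skip: (page - 1) * pageSize },      // skip the documents of earlier pages
        { $limit: pageSize }                   // keep one page of documents
    ];
}

const thirdPage = buildPagePipeline(3, 10);
console.log(JSON.stringify(thirdPage));

// `db` only exists inside the mongo shell, so the call is guarded.
if (typeof db !== "undefined") {
    db.test_study.aggregate(thirdPage).forEach(doc => printjson(doc));
}
```

The stage order matters: sorting first and then skipping and limiting gives a page of the sorted result, whereas limiting before sorting would only sort whichever documents happened to arrive first.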