CentOS 7.4 + MongoDB 3.6.4 Sharded Cluster Deployment and Bulk Data Import



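Preface

      MongoDB supports automatic sharding: the cluster splits the data and balances the load on its own, which avoids the difficulty of managing shards by hand. Sharding splits a collection into small chunks and spreads them across several shards, each of which holds part of the data. The chunks are transparent to the application, which does not need to know which data lives on which shard, or even whether sharding is in use. The application connects to a mongos process; mongos knows the mapping between data and shards, forwards each client request to the right shard, collects the responses, and returns the result to the client.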


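      Sharding is appropriate when:

1) A single server is running out of disk space.

2) A single mongod cannot keep up with an increasingly heavy write load.

3) A large amount of data should be kept in memory to improve performance.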


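      A sharded cluster needs three roles:

1. shard server
      The container that stores the actual data. In general a shard can be a single mongod instance or a replica set; MongoDB 3.6, the version used here, requires every shard to be a replica set. Even when a shard has several servers, only one of them is the primary, and the others hold copies of the same data. A replica set per shard is also what provides automatic failover inside each shard.


2. config server

      To store a collection across several shards, you specify a shard key for it; the shard key determines which chunk each document belongs to. The config servers store the configuration of all shard nodes, the shard key range of each chunk, the distribution of chunks across the shards, and the sharding configuration of every database and collection in the cluster.


3. route server
      The front-end router of the cluster: it routes all requests and merges the results. Clients connect here; the router asks the config servers which shards to query or write to, connects to those shards to perform the operation, and returns the result to the client. The client sends mongos (the route server) exactly the requests it would have sent to a mongod and does not need to know which shard holds the data.



      shard key: When setting up sharding, you choose a key from the collection as the basis for splitting the data; this key is the shard key. The choice of shard key determines how inserts are distributed across the shards. Its values must vary enough (high cardinality) for the data to spread well across the servers. It is also important to keep chunks at a reasonable size so that balancing and chunk migration do not consume excessive resources.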
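The routing idea described above can be sketched as a toy model. This is not MongoDB's implementation: the chunk boundaries and the chunk-to-shard assignment below are made-up values for illustration, and `shard_for` is a hypothetical helper. It only shows how a range-partitioned shard key lets a router pick one shard for a given key value.

```python
import bisect

# Hypothetical chunk table: chunk i covers [LOWER_BOUNDS[i], LOWER_BOUNDS[i + 1]).
# Both the bounds and the owning shards are invented for this illustration.
LOWER_BOUNDS = [float("-inf"), 20000000, 40000000, 60000000]
CHUNK_OWNERS = ["shard1", "shard2", "shard3", "shard4"]


def shard_for(shard_key_value):
    """Return the shard owning the chunk whose range contains the key value."""
    index = bisect.bisect_right(LOWER_BOUNDS, shard_key_value) - 1
    return CHUNK_OWNERS[index]


print(shard_for(60812000))
```

A key with few distinct values would put many documents into the same chunk in this model, which is why the shard key needs enough variation.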

 

1.  References

Reference documents:

 

https://www.cnblogs.com/ityouknow/p/7566682.html

 

https://docs.mongodb.com/manual/tutorial/deploy-shard-cluster/

 

 

2.  Installation Environment

2.1.  Environment

 

 

Servers

Hostname      server1.smartmap.com   server2.smartmap.com   server3.smartmap.com   server4.smartmap.com
IP            192.168.1.31           192.168.1.32           192.168.1.33           192.168.1.34
Subnet mask   255.255.255.0          255.255.255.0          255.255.255.0          255.255.255.0
Gateway       192.168.1.1            192.168.1.1            192.168.1.1            192.168.1.1
DNS           218.30.19.50           218.30.19.50           218.30.19.50           218.30.19.50
              61.134.1.5             61.134.1.5             61.134.1.5             61.134.1.5
Services      Config                 Config                 Config                 Route
              Shard1                 Shard1                 Shard1                 Shard2
              Shard2                 Shard2                 Shard3                 Shard3
              Shard3                 Shard4                 Shard4                 Shard4
              Route

Ports

Service   Port
Route     20000
Config    21000
Shard1    27001
Shard2    27002
Shard3    27003
Shard4    27004
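As a cross-check of the layout in section 2.1, the following sketch inverts the replica set membership used later in this guide into a per-server list of services. The `MEMBERS` table is transcribed from the configuration steps in section 4 (servers are identified by the last octet of 192.168.1.x); `services_by_server` is a hypothetical helper name.

```python
from collections import defaultdict

# Service -> servers hosting it, as configured in section 4 of this guide.
MEMBERS = {
    "Config": [31, 32, 33],
    "Shard1": [31, 32, 33],
    "Shard2": [34, 31, 32],
    "Shard3": [33, 34, 31],
    "Shard4": [32, 33, 34],
    "Route": [31, 34],
}


def services_by_server(members):
    """Invert service -> servers into server -> sorted list of services."""
    layout = defaultdict(list)
    for service, servers in members.items():
        for server in servers:
            layout[server].append(service)
    return {server: sorted(names) for server, names in sorted(layout.items())}


LAYOUT = services_by_server(MEMBERS)
for server, names in LAYOUT.items():
    print("192.168.1.%d: %s" % (server, ", ".join(names)))
```

Each shard has three members on three different servers, so losing any one server leaves every shard with a majority.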

 

2.2.  Install Base Software

[root@server1 ~]# yum install unzip wget ntp

 

2.3.  Install and Configure NTP

[root@server1 ~]# yum update

[root@server1 ~]# yum install unzip wget ntp

[root@server1 ~]# systemctl is-enabled ntpd

disabled

[root@server1 ~]# systemctl enable ntpd

Created symlink from /etc/systemd/system/multi-user.target.wants/ntpd.service to /usr/lib/systemd/system/ntpd.service.

[root@server1 ~]# systemctl start ntpd

[root@server1 ~]# ntpdate -u 2.asia.pool.ntp.org

 

 

2.4.  Add a User

 

[root@server1 ~]# useradd mongodb

[root@server1 ~]# passwd mongodb

[root@server1 ~]# chmod u+w /etc/sudoers

[root@server1 ~]#

[root@server1 ~]# vi /etc/sudoers

Add the following line:

 

mongodb  ALL=(ALL)       NOPASSWD: ALL

 

 

3.  Installation

3.1.  Download MongoDB

 

https://www.mongodb.com/download-center?jmp=nav#community

 


 

[root@server1 ~]# mkdir /opt/mongodb

[root@server1 ~]# chown -R mongodb:mongodb /opt/mongodb/

 

3.2.  Extract MongoDB

[root@server1 ~]# su - mongodb

[mongodb@server1 mongodb]$ cd /opt/mongodb

[mongodb@server1 mongodb]$ tar -zxvf mongodb-linux-x86_64-rhel70-3.6.4.tgz

[mongodb@server1 mongodb]$ mv mongodb-linux-x86_64-rhel70-3.6.4 mongodb-app

 

3.3.  Create the Directories

Create the conf, mongos, config, shard1, shard2, shard3, and shard4 directories on every server. Because mongos stores no data, it only needs a log directory.

[mongodb@server1 mongodb]$ mkdir -p /opt/mongodb/mongodb-app/conf

 

[mongodb@server1 mongodb]$ mkdir -p /opt/mongodb/mongodb-app/data/mongos/log

 

[mongodb@server1 mongodb]$ mkdir -p /opt/mongodb/mongodb-app/data/config/data

[mongodb@server1 mongodb]$ mkdir -p /opt/mongodb/mongodb-app/data/config/log

 

[mongodb@server1 mongodb]$ mkdir -p /opt/mongodb/mongodb-app/data/shard1/data

[mongodb@server1 mongodb]$ mkdir -p /opt/mongodb/mongodb-app/data/shard1/log

[mongodb@server1 mongodb]$ mkdir -p /opt/mongodb/mongodb-app/data/shard2/data

[mongodb@server1 mongodb]$ mkdir -p /opt/mongodb/mongodb-app/data/shard2/log

[mongodb@server1 mongodb]$ mkdir -p /opt/mongodb/mongodb-app/data/shard3/data

[mongodb@server1 mongodb]$ mkdir -p /opt/mongodb/mongodb-app/data/shard3/log

[mongodb@server1 mongodb]$ mkdir -p /opt/mongodb/mongodb-app/data/shard4/data

[mongodb@server1 mongodb]$ mkdir -p /opt/mongodb/mongodb-app/data/shard4/log
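The directory tree created above is regular enough to build in a loop. A minimal sketch, assuming Python 3 is available on the server; `create_layout` is a hypothetical helper and the base path is passed in rather than hard-coded.

```python
from pathlib import Path


def create_layout(base):
    """Create conf, the mongos log directory, and data/log directories per mongod."""
    base = Path(base)
    wanted = [base / "conf", base / "data" / "mongos" / "log"]
    for name in ["config", "shard1", "shard2", "shard3", "shard4"]:
        wanted.append(base / "data" / name / "data")
        wanted.append(base / "data" / name / "log")
    for path in wanted:
        # parents=True and exist_ok=True give the same effect as `mkdir -p`.
        path.mkdir(parents=True, exist_ok=True)
    return wanted
```

Calling `create_layout("/opt/mongodb/mongodb-app")` as the mongodb user would produce the same twelve directories as the commands above.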

 

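3.4.  Copy to the Other Nodes

The target directory /opt/mongodb must already exist on each node and be owned by the mongodb user (see sections 2.4 and 3.1).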

 

[mongodb@server2 ~]$ scp -r mongodb@192.168.1.31:/opt/mongodb/mongodb-app /opt/mongodb/

[mongodb@server3 ~]$ scp -r mongodb@192.168.1.31:/opt/mongodb/mongodb-app /opt/mongodb/

[mongodb@server4 ~]$ scp -r mongodb@192.168.1.31:/opt/mongodb/mongodb-app /opt/mongodb/

 

 

3.5.  Environment Variables

 

[mongodb@server1 mongodb]$ sudo vi /etc/profile

 

export MONGODB_HOME=/opt/mongodb/mongodb-app

export PATH=$MONGODB_HOME/bin:$PATH

 

[mongodb@server1 mongodb]$ source /etc/profile

[mongodb@server1 mongodb]$ mongod --version

 

4.  Configuration

4.1.  Config Servers

4.1.1.  Create the Configuration File

Create the following file on servers 31, 32, and 33 (192.168.1.31 to 192.168.1.33):

 

[mongodb@server1 mongodb]$ vi /opt/mongodb/mongodb-app/conf/config.conf

 

 

## content

systemLog:

  destination: file

  logAppend: true

  path: /opt/mongodb/mongodb-app/data/config/log/config.log

 

# Where and how to store data.

storage:

  dbPath: /opt/mongodb/mongodb-app/data/config/data

  journal:

    enabled: true

# how the process runs

processManagement:

  fork: true

  pidFilePath: /opt/mongodb/mongodb-app/data/config/log/configsrv.pid

 

# network interfaces

net:

  port: 21000

  bindIp: 0.0.0.0

 

#operationProfiling:

replication:

  replSetName: config

 

sharding:

  clusterRole: configsvr

 

4.1.2.  Start the Config Server on All Three Servers

 

[mongodb@server1 mongodb]$ mongod --config /opt/mongodb/mongodb-app/conf/config.conf

about to fork child process, waiting until server is ready for connections.

forked process: 8072

child process started successfully, parent exiting

[mongodb@server1 mongodb]$

 

To stop the config server when needed (it must be running for the next step):

[mongodb@server1 mongodb]$ mongod --shutdown --config /opt/mongodb/mongodb-app/conf/config.conf

 

4.1.3.  Log In to Any Config Server and Initialize the Config Replica Set

 

[mongodb@server1 conf]$ mongo 192.168.1.31:21000

MongoDB shell version v3.6.4

connecting to: mongodb://192.168.1.31:21000/test

MongoDB server version: 3.6.4

> use admin;

switched to db admin

> config = {

   _id : "config",

     members : [

         {_id : 0, host : "192.168.1.31:21000" },

         {_id : 1, host : "192.168.1.32:21000" },

   {_id : 2, host : "192.168.1.33:21000" }

     ]

 }

 

> rs.initiate(config);

config:SECONDARY> rs.status();

 

 

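The same steps, annotated:

# Log in (from the system shell)
mongo 192.168.1.31:21000

// Switch to the admin database
use admin;

// Define the replica set configuration
config = {
   _id : "config",
   members : [
        {_id : 0, host : "192.168.1.31:21000" },
        {_id : 1, host : "192.168.1.32:21000" },
        {_id : 2, host : "192.168.1.33:21000" }
   ]
}

// Initialize the replica set
rs.initiate(config);

// Check the replica set status
rs.status();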
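The replica set document passed to rs.initiate() has the same shape for the config servers and for every shard, so it can be generated. The sketch below is an assumption-laden equivalent in Python: `replica_set_config` and `initiate` are hypothetical helper names, `initiate` assumes pymongo 3.11 or later (for the `directConnection` URI option) and a running, not yet initialized mongod, and it sends the `replSetInitiate` command that the shell helper wraps.

```python
def replica_set_config(name, hosts):
    """Build the replica set document: one member per host, numbered from 0."""
    return {
        "_id": name,
        "members": [{"_id": i, "host": host} for i, host in enumerate(hosts)],
    }


def initiate(name, hosts):
    """Send replSetInitiate to the first host (needs pymongo and a live mongod)."""
    from pymongo import MongoClient  # lazy import: the helper above needs no driver

    # directConnection avoids replica set discovery on an uninitialized member.
    client = MongoClient("mongodb://%s/?directConnection=true" % hosts[0])
    try:
        return client.admin.command("replSetInitiate", replica_set_config(name, hosts))
    finally:
        client.close()


CONFIG_HOSTS = ["192.168.1.31:21000", "192.168.1.32:21000", "192.168.1.33:21000"]
print(replica_set_config("config", CONFIG_HOSTS))
```

For a shard, the same call would take the shard name and its three host:port pairs, for example `initiate("shard1", [...])`.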

 

4.2.  Shard Servers

4.2.1.  Configure shard1

4.2.1.1.  Create the Configuration File

Create the following file on servers 31, 32, and 33:

 

[mongodb@server1 mongodb]$ vi /opt/mongodb/mongodb-app/conf/shard1.conf

 

 

# shard1 config

systemLog:

  destination: file

  logAppend: true

  path: /opt/mongodb/mongodb-app/data/shard1/log/shard1.log

 

# Where and how to store data.

storage:

  dbPath: /opt/mongodb/mongodb-app/data/shard1/data

  journal:

    enabled: true

  wiredTiger:

    engineConfig:

      cacheSizeGB: 20

 

# how the process runs

processManagement:

  fork: true

  pidFilePath: /opt/mongodb/mongodb-app/data/shard1/log/shard1.pid

 

# network interfaces

net:

  port: 27001

  bindIp: 0.0.0.0

 

#operationProfiling:

replication:

  replSetName: shard1

sharding:

  clusterRole: shardsvr
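The shard2, shard3, and shard4 files that follow differ from this one only in the shard name and port, so they can be rendered from a single template. A minimal sketch: `render_shard_conf` is a hypothetical helper, and `cache_gb` is exposed as a parameter because `cacheSizeGB` applies per mongod and each server in this layout runs three shard members.

```python
SHARD_TEMPLATE = """\
systemLog:
  destination: file
  logAppend: true
  path: {base}/data/{name}/log/{name}.log
storage:
  dbPath: {base}/data/{name}/data
  journal:
    enabled: true
  wiredTiger:
    engineConfig:
      cacheSizeGB: {cache_gb}
processManagement:
  fork: true
  pidFilePath: {base}/data/{name}/log/{name}.pid
net:
  port: {port}
  bindIp: 0.0.0.0
replication:
  replSetName: {name}
sharding:
  clusterRole: shardsvr
"""


def render_shard_conf(name, port, base="/opt/mongodb/mongodb-app", cache_gb=20):
    """Return the YAML text of a shard configuration file."""
    return SHARD_TEMPLATE.format(name=name, port=port, base=base, cache_gb=cache_gb)


print(render_shard_conf("shard2", 27002))
```

Writing the result to `conf/shard2.conf` on the three servers that host shard2 would reproduce the file shown in section 4.2.2.1.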

 

4.2.1.2.  Start the shard1 Server on All Three Servers

 

[mongodb@server1 mongodb]$ mongod  --config  /opt/mongodb/mongodb-app/conf/shard1.conf

about to fork child process, waiting until server is ready for connections.

forked process: 8420

child process started successfully, parent exiting

[mongodb@server1 mongodb]$

 

To stop the shard1 server when needed (it must be running for the next step):

[mongodb@server1 mongodb]$ mongod --shutdown --config /opt/mongodb/mongodb-app/conf/shard1.conf

killing process with pid: 8420

 

4.2.1.3.  Log In to Any shard1 Server and Initialize the Replica Set

 

[mongodb@server1 mongodb]$ mongo 192.168.1.31:27001

MongoDB shell version v3.6.4

connecting to: mongodb://192.168.1.31:27001/test

MongoDB server version: 3.6.4

 

> use admin;

switched to db admin

> config = {

     _id : "shard1",

      members : [

          {_id : 0, host : "192.168.1.31:27001" },

          {_id : 1, host : "192.168.1.32:27001" },

          {_id : 2, host : "192.168.1.33:27001" }

      ]

 }

 

> rs.initiate(config);

{ "ok" : 1 }

shard1:SECONDARY> rs.status();

 

 

4.2.2.  Configure shard2

4.2.2.1.  Create the Configuration File

Create the following file on servers 34, 31, and 32:

 

[mongodb@server4 mongodb]$ vi /opt/mongodb/mongodb-app/conf/shard2.conf

 

# shard2 config

systemLog:

  destination: file

  logAppend: true

  path: /opt/mongodb/mongodb-app/data/shard2/log/shard2.log

 

# Where and how to store data.

storage:

  dbPath: /opt/mongodb/mongodb-app/data/shard2/data

  journal:

    enabled: true

  wiredTiger:

    engineConfig:

      cacheSizeGB: 20

 

# how the process runs

processManagement:

  fork: true

  pidFilePath: /opt/mongodb/mongodb-app/data/shard2/log/shard2.pid

 

# network interfaces

net:

  port: 27002

  bindIp: 0.0.0.0

 

#operationProfiling:

replication:

  replSetName: shard2

sharding:

  clusterRole: shardsvr

 

 

4.2.2.2.  Start the shard2 Server on All Three Servers

 

[mongodb@server4 mongodb]$ mongod --config /opt/mongodb/mongodb-app/conf/shard2.conf

about to fork child process, waiting until server is ready for connections.

forked process: 8025

child process started successfully, parent exiting

[mongodb@server4 mongodb]$

 

To stop the shard2 server when needed (it must be running for the next step):

[mongodb@server1 mongodb]$ mongod --shutdown --config /opt/mongodb/mongodb-app/conf/shard2.conf

killing process with pid: 8420

 

4.2.2.3.  Log In to Any shard2 Server and Initialize the Replica Set

 

[mongodb@server4 mongodb]$ mongo 192.168.1.34:27002

MongoDB shell version v3.6.4

connecting to: mongodb://192.168.1.34:27002/test

MongoDB server version: 3.6.4

Welcome to the MongoDB shell.

> use admin;

switched to db admin

> config = {

  _id : "shard2",

  members : [

  {_id : 0, host : "192.168.1.34:27002" },

  {_id : 1, host : "192.168.1.31:27002" },

  {_id : 2, host : "192.168.1.32:27002" }

  ]

 }

> rs.initiate(config);

{ "ok" : 1 }

shard2:OTHER> rs.status();

 

4.2.3.  Configure shard3

4.2.3.1.  Create the Configuration File

Create the following file on servers 33, 34, and 31:

 

[mongodb@server3 mongodb]$ vi /opt/mongodb/mongodb-app/conf/shard3.conf

 

# shard3 config

systemLog:

  destination: file

  logAppend: true

  path: /opt/mongodb/mongodb-app/data/shard3/log/shard3.log

 

# Where and how to store data.

storage:

  dbPath: /opt/mongodb/mongodb-app/data/shard3/data

  journal:

    enabled: true

  wiredTiger:

    engineConfig:

      cacheSizeGB: 20

 

# how the process runs

processManagement:

  fork: true

  pidFilePath: /opt/mongodb/mongodb-app/data/shard3/log/shard3.pid

 

# network interfaces

net:

  port: 27003

  bindIp: 0.0.0.0

 

#operationProfiling:

replication:

  replSetName: shard3

sharding:

  clusterRole: shardsvr

 

4.2.3.2.  Start the shard3 Server on All Three Servers

 

[mongodb@server3 mongodb]$ mongod --config /opt/mongodb/mongodb-app/conf/shard3.conf

about to fork child process, waiting until server is ready for connections.

forked process: 8179

child process started successfully, parent exiting

[mongodb@server3 mongodb]$

 

To stop the shard3 server when needed (it must be running for the next step):

[mongodb@server1 mongodb]$ mongod --shutdown --config /opt/mongodb/mongodb-app/conf/shard3.conf

killing process with pid: 8420

 

 

4.2.3.3.  Log In to Any shard3 Server and Initialize the Replica Set

 

[mongodb@server3 mongodb]$ mongo 192.168.1.33:27003

MongoDB shell version v3.6.4

connecting to: mongodb://192.168.1.33:27003/test

MongoDB server version: 3.6.4

Welcome to the MongoDB shell.

> use admin

switched to db admin

> config = {

  _id : "shard3",

  members : [

  {_id : 0, host : "192.168.1.33:27003" },

  {_id : 1, host : "192.168.1.34:27003" },

  {_id : 2, host : "192.168.1.31:27003" }

  ]

 }

> rs.initiate(config);

{ "ok" : 1 }

shard3:OTHER> rs.status();

 

4.2.4.  Configure shard4

4.2.4.1.  Create the Configuration File

Create the following file on servers 32, 33, and 34:

[mongodb@server2 mongodb]$ vi /opt/mongodb/mongodb-app/conf/shard4.conf

 

# shard4 config

systemLog:

  destination: file

  logAppend: true

  path: /opt/mongodb/mongodb-app/data/shard4/log/shard4.log

 

# Where and how to store data.

storage:

  dbPath: /opt/mongodb/mongodb-app/data/shard4/data

  journal:

    enabled: true

  wiredTiger:

    engineConfig:

      cacheSizeGB: 20

 

# how the process runs

processManagement:

  fork: true

  pidFilePath: /opt/mongodb/mongodb-app/data/shard4/log/shard4.pid

 

# network interfaces

net:

  port: 27004

  bindIp: 0.0.0.0

 

#operationProfiling:

replication:

  replSetName: shard4

sharding:

  clusterRole: shardsvr

 

4.2.4.2.  Start the shard4 Server on All Three Servers

 

[mongodb@server2 mongodb]$ mongod --config /opt/mongodb/mongodb-app/conf/shard4.conf

about to fork child process, waiting until server is ready for connections.

forked process: 8436

child process started successfully, parent exiting

To stop the shard4 server when needed (it must be running for the next step):

[mongodb@server4 mongodb]$ mongod --shutdown --config /opt/mongodb/mongodb-app/conf/shard4.conf

killing process with pid: 8238

 

4.2.4.3.  Log In to Any shard4 Server and Initialize the Replica Set

 

[mongodb@server2 mongodb]$ mongo 192.168.1.32:27004

MongoDB shell version v3.6.4

connecting to: mongodb://192.168.1.32:27004/test

MongoDB server version: 3.6.4

Welcome to the MongoDB shell.

For interactive help, type "help".

> use admin

switched to db admin

> config = {

  _id : "shard4",

  members : [

  {_id : 0, host : "192.168.1.32:27004" },

  {_id : 1, host : "192.168.1.33:27004" },

  {_id : 2, host : "192.168.1.34:27004" }

  ]

 }

> rs.initiate(config);

{ "ok" : 1 }

shard4:OTHER> rs.status();

 

4.3.  Configure the mongos Routers

4.3.1.  Create the Configuration File

Create the following file on servers 31 and 34:

 

[mongodb@server1 mongodb]$ vi /opt/mongodb/mongodb-app/conf/mongos.conf

 

systemLog:

  destination: file

  logAppend: true

  path: /opt/mongodb/mongodb-app/data/mongos/log/mongos.log

processManagement:

  fork: true

#  pidFilePath: /opt/mongodb/mongodb-app/data/mongos/log/mongos.pid

 

# network interfaces

net:

  port: 20000

  bindIp: 0.0.0.0

# Config servers to connect to, written as <replica set name>/<host:port list>; "config" is the name of the config server replica set

sharding:

   configDB: config/192.168.1.31:21000,192.168.1.32:21000,192.168.1.33:21000

 

4.3.2.  Start mongos on Both Servers

Note: start the config servers and the shard servers first, then the mongos instances.

 

[mongodb@server1 conf]$ mongos --config /opt/mongodb/mongodb-app/conf/mongos.conf

about to fork child process, waiting until server is ready for connections.

forked process: 9869

child process started successfully, parent exiting
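Because mongos should start only after the config servers and shard servers are up, a small readiness check can run first. This is a sketch under stated assumptions: `port_open` and `wait_for` are hypothetical helpers, and an accepted TCP connection only shows that a mongod is listening, not that its replica set has elected a primary.

```python
import socket
import time


def port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_for(endpoints, attempts=30, delay=1.0):
    """Return True once every (host, port) accepts connections, False if attempts run out."""
    for _ in range(attempts):
        if all(port_open(host, port) for host, port in endpoints):
            return True
        time.sleep(delay)
    return False
```

For this cluster, the endpoints would be the three config servers on port 21000 plus each shard member on ports 27001 to 27004, for example `wait_for([("192.168.1.31", 21000), ("192.168.1.32", 21000), ("192.168.1.33", 21000)])`.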

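4.4.  Enable Sharding

The config servers, the routers, and the shard servers are now running, but an application that connects to mongos cannot use sharding yet: the shards still have to be added to the cluster through mongos.

Log in to either mongos: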

 

[mongodb@server1 data]$ mongo 192.168.1.31:20000/admin

MongoDB shell version v3.6.4

connecting to: mongodb://192.168.1.31:20000/admin

MongoDB server version: 3.6.4

mongos> use admin;

switched to db admin

mongos> sh.addShard("shard1/192.168.1.31:27001,192.168.1.32:27001,192.168.1.33:27001")

 

mongos> sh.addShard("shard2/192.168.1.34:27002,192.168.1.31:27002,192.168.1.32:27002")

 

mongos> sh.addShard("shard3/192.168.1.33:27003,192.168.1.34:27003,192.168.1.31:27003")

 

mongos> sh.addShard("shard4/192.168.1.32:27004,192.168.1.33:27004,192.168.1.34:27004")

 

mongos> sh.status()

--- Sharding Status ---

  sharding version: {

        "_id" : 1,

        "minCompatibleVersion" : 5,

        "currentVersion" : 6,

        "clusterId" : ObjectId("5b00d641da35619896a78891")

  }

  shards:

        {  "_id" : "shard1", "host" : "shard1/192.168.1.31:27001,192.168.1.32:27001,192.168.1.33:27001",  "state" : 1 }

        {  "_id" : "shard2", "host" : "shard2/192.168.1.31:27002,192.168.1.32:27002,192.168.1.34:27002",  "state" : 1 }

        {  "_id" : "shard3", "host" : "shard3/192.168.1.31:27003,192.168.1.33:27003,192.168.1.34:27003",  "state" : 1 }

        {  "_id" : "shard4", "host" : "shard4/192.168.1.32:27004,192.168.1.33:27004,192.168.1.34:27004",  "state" : 1 }

  active mongoses:

        "3.6.4" : 2

  autosplit:

        Currently enabled: yes

  balancer:

        Currently enabled:  yes

        Currently running:  no

        Failed balancer rounds in last 5 attempts:  0

        Migration Results for the last 24 hours:

                No recent migrations

  databases:

        {  "_id" : "config", "primary" : "config",  "partitioned" : true }

 

mongos>

mongos> sh.getBalancerState()

true

mongos> sh.startBalancer()

{

        "ok" : 1,

        "$clusterTime" : {

                "clusterTime" : Timestamp(1526784919, 4),

                "signature" : {

                        "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),

                        "keyId" : NumberLong(0)

                }

        },

        "operationTime" : Timestamp(1526784919, 4)

}
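The four sh.addShard() calls above follow one pattern, so the seed strings can be generated from the membership lists. A minimal sketch: `shard_seed` and `add_shards` are hypothetical helpers, and `admin_db` is assumed to be a pymongo `Database` for `admin` obtained through a mongos (for example `MongoClient("mongodb://192.168.1.31:20000").admin`); `addShard` is the server command behind the shell helper.

```python
# Shard name -> replica set members, as initialized in section 4.2.
SHARDS = {
    "shard1": ["192.168.1.31:27001", "192.168.1.32:27001", "192.168.1.33:27001"],
    "shard2": ["192.168.1.34:27002", "192.168.1.31:27002", "192.168.1.32:27002"],
    "shard3": ["192.168.1.33:27003", "192.168.1.34:27003", "192.168.1.31:27003"],
    "shard4": ["192.168.1.32:27004", "192.168.1.33:27004", "192.168.1.34:27004"],
}


def shard_seed(name, hosts):
    """Format a shard as <replica set name>/<host:port>,<host:port>,..."""
    return "%s/%s" % (name, ",".join(hosts))


def add_shards(admin_db, shards):
    """Run addShard once per shard and return the server replies."""
    return [admin_db.command("addShard", shard_seed(name, hosts))
            for name, hosts in shards.items()]


for name, hosts in SHARDS.items():
    print(shard_seed(name, hosts))
```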

 

5.  Verification

5.1.  List the Shards

[mongodb@server1 data]$ mongo 192.168.1.31:20000/admin

MongoDB shell version v3.6.4

connecting to: mongodb://192.168.1.31:20000/admin

MongoDB server version: 3.6.4

mongos> use admin;

switched to db admin

mongos> db.runCommand({ listshards:1 });   // list the shards

{

        "shards" : [

                {

                        "_id" : "shard1",

                        "host" : "shard1/192.168.1.31:27001,192.168.1.32:27001,192.168.1.33:27001",

                        "state" : 1

                },

                {

                        "_id" : "shard2",

                        "host" : "shard2/192.168.1.31:27002,192.168.1.32:27002,192.168.1.34:27002",

                        "state" : 1

                },

                {

                        "_id" : "shard3",

                        "host" : "shard3/192.168.1.31:27003,192.168.1.33:27003,192.168.1.34:27003",

                        "state" : 1

                },

                {

                        "_id" : "shard4",

                        "host" : "shard4/192.168.1.32:27004,192.168.1.33:27004,192.168.1.34:27004",

                        "state" : 1

                }

        ],

        "ok" : 1,

        "$clusterTime" : {

                "clusterTime" : Timestamp(1526786005, 2),

                "signature" : {

                        "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),

                        "keyId" : NumberLong(0)

                }

        },

        "operationTime" : Timestamp(1526786005, 2)

}

 

5.2.  Create a Sharded Database and Collection

 

[mongodb@server1 data]$ mongo 192.168.1.31:20000/admin

MongoDB shell version v3.6.4

connecting to: mongodb://192.168.1.31:20000/admin

MongoDB server version: 3.6.4

mongos> use admin;

switched to db admin

mongos> db.runCommand({ enablesharding: "RHY" });

{

        "ok" : 1,

        "$clusterTime" : {

                "clusterTime" : Timestamp(1526791888, 10),

                "signature" : {

                        "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),

                        "keyId" : NumberLong(0)

                }

        },

        "operationTime" : Timestamp(1526791888, 10)

}

mongos> db.runCommand({ shardcollection: "RHY.ST_RIVER_R", key: {STCD: 1}, unique: false })

{

        "collectionsharded" : "RHY.ST_RIVER_R",

        "collectionUUID" : UUID("370a8a62-cd5d-488a-b752-6fedba5da507"),

        "ok" : 1,

        "$clusterTime" : {

                "clusterTime" : Timestamp(1526791904, 14),

                "signature" : {

                        "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),

                        "keyId" : NumberLong(0)

                }

        },

        "operationTime" : Timestamp(1526791904, 14)

}

mongos>
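The two commands above can also be issued from Python. A minimal sketch, assuming `admin_db` is a pymongo `Database` for `admin` obtained through a mongos; `shard_collection` is a hypothetical helper that sends the same `enableSharding` and `shardCollection` commands.

```python
def shard_collection(admin_db, namespace, key):
    """Enable sharding on the database, then shard the collection on the given key."""
    database = namespace.split(".", 1)[0]
    admin_db.command("enableSharding", database)
    # Extra keyword arguments are added to the command document by pymongo.
    return admin_db.command("shardCollection", namespace, key=key)
```

With a live cluster, the call matching this section would be `shard_collection(MongoClient("mongodb://192.168.1.31:20000").admin, "RHY.ST_RIVER_R", {"STCD": 1})`. Passing a compound key such as `{"STCD": 1, "TM": 1}` is possible with the same helper, though the original setup uses STCD alone.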

 

5.3.  Import CSV Data into a Standalone MongoDB

 

[root@server1 bin]# ./mongoimport -d RHY -c ST_STBPRP_B --type csv --headerline --file /opt/mongodb/mydata/ST_STBPRP_B.CSV

2018-05-20T16:18:37.849+0800    connected to: localhost

2018-05-20T16:18:38.748+0800    imported 27276 documents

 

5.4.  Import Data into the Sharded Cluster

5.4.1.  Install the Python Driver

pip install pymongo

5.4.2.  Access the MongoDB Sharded Cluster from Python

 

import datetime

from pymongo import MongoClient

# Connect through both mongos routers rather than to any single shard.
client = MongoClient('mongodb://192.168.1.31:20000,192.168.1.34:20000')
db = client.RHY
collection = db.ST_RIVER_R

# Columns: STCD,TM,Z,Q,XSA,XSAVV,XSMXV,FLWCHRCD,WPTN,MSQMT,MSAMT,MSVMT
FLOAT_FIELDS = {'Z': 2, 'Q': 3, 'XSA': 4, 'XSAVV': 5, 'XSMXV': 6}
TEXT_FIELDS = {'FLWCHRCD': 7, 'WPTN': 8, 'MSQMT': 9, 'MSAMT': 10, 'MSVMT': 11}
BATCH_SIZE = 1000
# Batches are written only once the line count passes this value, which
# resumes an interrupted import; set it to 0 to import the whole file.
RESUME_AFTER = 1451000

count = 0        # data lines read so far
insertCount = 0  # full batches assembled so far
records = []

with open("D:/bigdata/st_river_r.CSV") as f:
    print(f.readline())  # header line
    for line in f:
        count = count + 1
        # Strip each value so the last column does not keep the newline.
        fieldValues = [value.strip() for value in line.split(',')]
        # Require all 12 columns and a non-empty station code (the shard key).
        if len(fieldValues) == 12 and fieldValues[0] != '':
            insertObj = {'STCD': fieldValues[0]}
            if fieldValues[1] != '':
                insertObj['TM'] = datetime.datetime.strptime(fieldValues[1], '%Y-%m-%d %H:%M:%S')
            # Empty columns are omitted from the document.
            for name, index in FLOAT_FIELDS.items():
                if fieldValues[index] != '':
                    insertObj[name] = float(fieldValues[index])
            for name, index in TEXT_FIELDS.items():
                if fieldValues[index] != '':
                    insertObj[name] = fieldValues[index]
            records.append(insertObj)
            if len(records) == BATCH_SIZE:
                insertCount = insertCount + 1
                if count > RESUME_AFTER:
                    collection.insert_many(records)
                    print(str(count) + '  ' + str(insertCount))
                print(count)
                records = []
        else:
            print(line)  # report malformed lines instead of importing them

# Write the final partial batch, which the loop above never flushes.
if records and count > RESUME_AFTER:
    collection.insert_many(records)

client.close()

 

 

5.5.  Back Up Data from a Standalone MongoDB

 

[root@server1 bin]# ./mongodump -h 127.0.0.1:27017 -d RHY -o /opt/mongodb/mydata/dump

 

5.6.  Restore Data into the Sharded Cluster

 

 

[mongodb@server1 bin]$ ./mongorestore -h 192.168.1.31 --port 20000 -d RHY /opt/mongodb/mydata/dump/RHY

2018-05-20T17:15:20.762+0800    the --db and --collection args should only be used when restoring from a BSON file. Other uses are deprecated and will not exist in the future; use --nsInclude instead

2018-05-20T17:15:20.763+0800    building a list of collections to restore from /opt/mongodb/mydata/dump/RHY dir

2018-05-20T17:15:20.767+0800    reading metadata for RHY.ST_RIVER_R from /opt/mongodb/mydata/dump/RHY/ST_RIVER_R.metadata.json

2018-05-20T17:15:20.767+0800    reading metadata for RHY.ST_STBPRP_B from /opt/mongodb/mydata/dump/RHY/ST_STBPRP_B.metadata.json

2018-05-20T17:15:20.779+0800    restoring RHY.ST_RIVER_R from /opt/mongodb/mydata/dump/RHY/ST_RIVER_R.bson

2018-05-20T17:15:20.779+0800    restoring RHY.ST_STBPRP_B from /opt/mongodb/mydata/dump/RHY/ST_STBPRP_B.bson

2018-05-20T17:15:23.772+0800    [####....................]  RHY.ST_STBPRP_B  2.61MB/13.6MB (19.1%)

2018-05-20T17:15:23.772+0800    [........................]   RHY.ST_RIVER_R   849KB/2.89GB   (0.0%)

2018-05-20T17:15:23.772+0800

2018-05-20T17:15:26.759+0800    [##########..............]  RHY.ST_STBPRP_B  6.04MB/13.6MB (44.4%)

2018-05-20T17:15:26.759+0800    [........................]   RHY.ST_RIVER_R  2.15MB/2.89GB   (0.1%)

2018-05-20T17:15:26.759+0800

2018-05-20T17:15:29.759+0800    [#####################...]  RHY.ST_STBPRP_B  12.0MB/13.6MB (87.9%)

2018-05-20T17:15:29.759+0800    [........................]   RHY.ST_RIVER_R  4.63MB/2.89GB   (0.2%)

2018-05-20T17:15:29.759+0800

2018-05-20T17:15:31.447+0800    [########################]  RHY.ST_STBPRP_B  13.6MB/13.6MB (100.0%)

2018-05-20T17:15:31.447+0800    no indexes to restore

2018-05-20T17:15:31.447+0800    finished restoring RHY.ST_STBPRP_B (27276 documents)

2018-05-20T17:15:32.758+0800    [........................]  RHY.ST_RIVER_R  6.79MB/2.89GB (0.2%)

2018-05-20T17:15:35.758+0800    [........................]  RHY.ST_RIVER_R  8.61MB/2.89GB (0.3%)

2018-05-20T17:15:38.758+0800    [........................]  RHY.ST_RIVER_R  11.9MB/2.89GB (0.4%)

2018-05-20T17:15:41.758+0800    [........................]  RHY.ST_RIVER_R  15.7MB/2.89GB (0.5%)

 

[mongodb@server4 conf]$ mongo 192.168.1.31:20000

MongoDB shell version v3.6.4

connecting to: mongodb://192.168.1.31:20000/test

MongoDB server version: 3.6.4

Server has startup warnings:

2018-05-20T17:01:28.017+0800 I CONTROL  [main]

2018-05-20T17:01:28.018+0800 I CONTROL  [main] ** WARNING: Access control is not enabled for the database.

2018-05-20T17:01:28.018+0800 I CONTROL  [main] **          Read and write access to data and configuration is unrestricted.

2018-05-20T17:01:28.018+0800 I CONTROL  [main]

mongos> use RHY

switched to db RHY

mongos> db.ST_RIVER_R.count()

1589817

mongos> db.ST_RIVER_R.findOne()

{

        "_id" : ObjectId("5b012fdee4a39884b75e4880"),

        "STCD" : 60812000,

        "TM" : "2011-04-13 02:00:00",

        "Z" : 405,

        "Q" : 60,

        "XSA" : "",

        "XSAVV" : "",

        "XSMXV" : "",

        "FLWCHRCD" : "",

        "WPTN" : 4,

        "MSQMT" : 1,

        "MSAMT" : "",

        "MSVMT" : ""

}

mongos> db.ST_RIVER_R.find({"STCD":60812000})

{ "_id" : ObjectId("5b012fdee4a39884b75e4880"), "STCD" : 60812000, "TM" : "2011-04-13 02:00:00", "Z" : 405, "Q" : 60, "XSA" : "", "XSAVV" : "", "XSMXV" : "", "FLWCHRCD" : "", "WPTN" : 4, "MSQMT" : 1, "MSAMT" : "", "MSVMT" : "" }

{ "_id" : ObjectId("5b012fdee4a39884b75e48ad"), "STCD" : 60812000, "TM" : "2011-04-19 02:00:00", "Z" : 377.71, "Q" : 13.3, "XSA" : "", "XSAVV" : "", "XSMXV" : "", "FLWCHRCD" : "", "WPTN" : 6, "MSQMT" : 1, "MSAMT" : "", "MSVMT" : "" }
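To see how the restored documents are spread over the shards, the per-shard counts can be read from the collection statistics. This is a sketch under assumptions: it relies on `collStats`, when run through mongos on a sharded collection, returning a `shards` sub-document keyed by shard name with a `count` field per shard; `counts_per_shard` and `fetch_counts` are hypothetical helpers, and the sample numbers in the usage line are invented, not measured on this cluster.

```python
def counts_per_shard(coll_stats):
    """Extract {shard name: document count} from a collStats reply obtained via mongos."""
    shards = coll_stats.get("shards", {})
    return {name: stats.get("count", 0) for name, stats in shards.items()}


def fetch_counts(uri="mongodb://192.168.1.31:20000", database="RHY", collection="ST_RIVER_R"):
    """Run collStats through mongos (needs pymongo and a live cluster)."""
    from pymongo import MongoClient  # lazy import: the helper above needs no driver

    client = MongoClient(uri)
    try:
        return counts_per_shard(client[database].command("collStats", collection))
    finally:
        client.close()


# Invented sample reply, only to show the shape the helper expects.
SAMPLE = {"count": 10, "shards": {"shard1": {"count": 7}, "shard2": {"count": 3}}}
print(counts_per_shard(SAMPLE))
```

A heavily skewed result would suggest that the shard key values are not varied enough for the balancer to spread the chunks.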

 

Original article: https://www.cnblogs.com/gispathfinder/p/9064039.html