Ceph monitor: summary 1

References

Ceph code and glossary http://accelazh.github.io/ceph/Ceph-Code-Deep-Dive

Ceph web book  http://blog.sina.com.cn/s/blog_153c9453d0102xvwi.html

ceph-deploy source code analysis (the blog has other posts)  http://www.hl10502.com/2017/06/15/ceph-deploy-cli/

How a ceph command typed in the terminal gets invoked https://blog.csdn.net/qq_36118718/article/details/79195621

Ceph monitor and paxos http://catkang.github.io/2016/07/17/ceph-monitor-and-paxos.html

Ceph monitor implementation https://www.jianshu.com/p/60b34ba5cdf2

Ceph source code analysis https://blog.csdn.net/qq_36118718/article/details/79234737

src/ceph_mon.cc    monitor startup flow, main() function: https://blog.csdn.net/kakaxi8891/article/details/10921297

ceph-monmap command handling flow https://blog.csdn.net/carny/article/details/52583560

src/mon/

Monitor leader election source code analysis https://blog.csdn.net/sy_yu/article/details/79102202

                                                         https://segmentfault.com/a/1190000010413557

Blog    https://blog.csdn.net/sy_yu?t=1

Monitor leader election https://blog.csdn.net/sy_yu/article/details/79102202

Monitor election mechanism https://blog.csdn.net/scaleqiao/article/details/52315468

https://www.cnblogs.com/shanno/p/3967116.html

MonitorDB source code analysis, monmap synchronization  https://blog.csdn.net/qq_36118718/article/details/79234737

Paxos source code annotations  https://blog.csdn.net/fishermandong/article/details/72805660

Paxos source code, phase 1 https://blog.csdn.net/fishermandong/article/details/72805660

                  phase 2  https://blog.csdn.net/fishermandong/article/details/76360237

Paxos algorithm https://blog.csdn.net/qq_36118718/article/details/79134887

                       https://blog.csdn.net/skdkjzz/article/details/41979521

Maps   https://ceph-doc.readthedocs.io/en/latest/Monitor/

PGmap,OSDmap  http://ju.outofmemory.cn/entry/76367

A Ceph Monitor maintains a master copy of the cluster map. A robust Ceph cluster usually runs several monitors, which serve the cluster map to clients.

the basic framework of a monitor

A Ceph monitor consists of a K/V store, Paxos, and the PaxosService layer. The K/V store persists the monitor's data. Paxos provides consistent data access for the PaxosService layer. Each PaxosService represents one kind of cluster state information; it converts its data to key-value form and writes it to the Paxos layer.
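A minimal Python sketch of this layering, assuming an in-memory dict in place of the real K/V store. The class names echo the components above, but the methods, the key layout, and the "osdmap" example are hypothetical and far simpler than Ceph's code; in particular, the Paxos stand-in commits immediately instead of asking a quorum.

```python
import json


class KVStore:
    """Stand-in for the persistent K/V store (an in-memory dict here)."""

    def __init__(self):
        self.data = {}

    def put(self, prefix, key, value):
        self.data[(prefix, key)] = value

    def get(self, prefix, key):
        return self.data.get((prefix, key))


class Paxos:
    """Stand-in for the Paxos layer: versions every write it stores."""

    def __init__(self, store):
        self.store = store
        self.last_committed = 0

    def propose(self, prefix, kv_pairs):
        # A real implementation would get quorum agreement before this point.
        self.last_committed += 1
        for key, value in kv_pairs.items():
            self.store.put(prefix, key, value)
        self.store.put("paxos", "last_committed", self.last_committed)
        return self.last_committed


class PaxosService:
    """One kind of cluster state (hypothetical example: an OSD map)."""

    def __init__(self, name, paxos):
        self.name = name
        self.paxos = paxos

    def update(self, epoch, state):
        # Encode the state as key-value pairs, then hand them to Paxos.
        return self.paxos.propose(self.name, {str(epoch): json.dumps(state)})

    def read(self, epoch):
        raw = self.paxos.store.get(self.name, str(epoch))
        return None if raw is None else json.loads(raw)


store = KVStore()
osdmap = PaxosService("osdmap", Paxos(store))
version = osdmap.update(1, {"osd.0": "up"})
```

The service never touches the store for writes; every change goes through the Paxos layer, which is the point of the three-layer split.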

the initialization and leader election of a monitor

When a monitor starts or restarts, it connects to the other monitors listed in the monmap. On its first start, it builds the monmap from the Ceph configuration file and stores it in the MonitorDBStore; on later starts, it reads the monmap back from the MonitorDBStore. Either way, the monitor initializes the MonitorDBStore as soon as it starts. It also initializes the Messenger, the network thread module, and registers the callback function that runs when a request is received. Paxos and PaxosService are described in detail later. The bootstrap process is called again and again and plays an important role in the lifecycle of a monitor.
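The first-start versus restart branch can be sketched as follows. This is an illustration only: a plain dict stands in for the MonitorDBStore, and the function name, the "monmap" key, the config layout, and the addresses are all hypothetical.

```python
def load_or_build_monmap(store, config):
    """Return the monmap, building it from the config on first start.

    `store` is a plain dict standing in for MonitorDBStore.
    """
    monmap = store.get("monmap")
    if monmap is None:
        # First start: build the monmap from the configuration and persist it.
        monmap = {"epoch": 0, "mons": dict(config["mon_hosts"])}
        store["monmap"] = monmap
    return monmap


config = {"mon_hosts": {"a": "10.0.0.1:6789", "b": "10.0.0.2:6789"}}
store = {}
first = load_or_build_monmap(store, config)
# On a restart the stored copy is used, whatever the config now says.
second = load_or_build_monmap(store, {"mon_hosts": {}})
```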

After bootstrap, the monitor is in STATE_PROBING, in which it communicates and synchronizes with the other monitors. After synchronization, the monitors hold an election that decides the role of each one. The detailed process is as follows.
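The sequence of states can be sketched as a small state machine. The state names follow the STATE_* naming used above; the transition function and its parameters are hypothetical, and the rule that the lowest rank becomes leader is an assumption based on Ceph's classic election strategy, which this text does not spell out.

```python
from enum import Enum


class State(Enum):
    PROBING = "probing"
    SYNCHRONIZING = "synchronizing"
    ELECTING = "electing"
    LEADER = "leader"
    PEON = "peon"


def next_state(state, *, behind=False, rank=None, quorum_ranks=()):
    """Return the state that follows `state` in this simplified model."""
    if state is State.PROBING:
        # A monitor that is behind its peers synchronizes before it can vote.
        return State.SYNCHRONIZING if behind else State.ELECTING
    if state is State.SYNCHRONIZING:
        return State.ELECTING
    if state is State.ELECTING:
        # Assumption: the lowest rank in the quorum wins the election.
        return State.LEADER if rank == min(quorum_ranks) else State.PEON
    return state


path = [State.PROBING]
path.append(next_state(path[-1], behind=True))
path.append(next_state(path[-1]))
path.append(next_state(path[-1], rank=2, quorum_ranks=(0, 1, 2)))
```

Here a monitor with rank 2 that started out behind ends up as a peon, because rank 0 is in the quorum.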

Probing and synchronizing process:

Leader election process:

Paxos: recovery and propose

The following variables are important in Paxos. They must be persisted in the MonitorDBStore.

 
 
 
 
last_pn: Last Proposal Number.
accepted_pn: The last Proposal Number we have accepted. On the Leader, it will be the Proposal Number picked by the Leader itself. On the Peon, however, it will be the proposal sent by the Leader and it will only be updated if its value is higher than the one already known by the Peon.
uncommitted_pn: Uncommitted value's Proposal Number. We use this variable to assess if the Leader should take into consideration an uncommitted value sent by a Peon. Given that the Peon will send back to the Leader the last Proposal Number it accepted, the Leader will be able to infer if this value is more recent than the one the Leader has, thus more relevant.
first_committed: First committed value's version.
last_committed: Last committed value's version. On both the Leader and the Peons, this is the last value's version that was accepted by a given quorum and thus committed, that this instance knows about.
uncommitted_v: Uncommitted value's version. If we have, or end up knowing about, an uncommitted value, then its version will be kept in this variable.
uncommitted_value: If the system fails in-between the accept replies from the Peons and the instruction to commit from the Leader, then we may end up with accepted but yet-uncommitted values. During the Leader's recovery, it will attempt to bring the whole system to the latest state, and that means committing past accepted but uncommitted values. This variable will hold an uncommitted value, which may originate either on the Leader, or learnt by the Leader from a Peon during the collect phase.
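As an illustration, these variables can be grouped into one record that is saved and restored across restarts. The field names come from the list above; the class, the `paxos/<name>` key layout, and the dict used as a store are hypothetical.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class PaxosState:
    last_pn: int = 0
    accepted_pn: int = 0
    uncommitted_pn: int = 0
    first_committed: int = 0
    last_committed: int = 0
    uncommitted_v: int = 0
    uncommitted_value: Optional[bytes] = None

    def save(self, store):
        """Persist every field under a hypothetical 'paxos/<name>' key."""
        for name, value in asdict(self).items():
            store[f"paxos/{name}"] = value

    @classmethod
    def load(cls, store):
        """Rebuild the state after a restart; missing keys keep defaults."""
        names = cls.__dataclass_fields__
        found = {n: store[f"paxos/{n}"] for n in names if f"paxos/{n}" in store}
        return cls(**found)


state = PaxosState(last_pn=201, accepted_pn=201, last_committed=7)
store = {}
state.save(store)
restored = PaxosState.load(store)
```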

After leader election, the leader and peon roles are settled. Before serving consistent reads and writes, the monitor cluster must first run phase 1, RECOVERY, to make the PN (proposal number) consistent across the monitors. The flow is as follows.
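A simplified sketch of the collect round in phase 1. It assumes that proposal numbers are made unique by combining a counter with the monitor's rank (modeled on the scheme in Ceph's Paxos code, simplified here), and that the leader keeps the uncommitted value with the highest uncommitted_pn. The class, the function names, and the tuple layout are hypothetical, and catching up lagging monitors on committed versions is left out.

```python
def new_proposal_number(last_pn, rank, greater_than=0):
    """Pick a PN above both inputs that is unique for this monitor rank."""
    base = max(last_pn, greater_than)
    return (base // 100 + 1) * 100 + rank


class Peon:
    def __init__(self, accepted_pn, last_committed, uncommitted=None):
        self.accepted_pn = accepted_pn
        self.last_committed = last_committed
        # (uncommitted_pn, uncommitted_v, value), or None
        self.uncommitted = uncommitted

    def handle_collect(self, pn):
        """Adopt `pn` only if it is higher than what we already accepted."""
        if pn > self.accepted_pn:
            self.accepted_pn = pn
        return self.accepted_pn, self.last_committed, self.uncommitted


def collect(leader_last_pn, leader_rank, peons):
    """Run collect rounds until every peon adopts the leader's PN.

    Returns (pn, uncommitted_entry_to_recommit_or_None).
    """
    pn = new_proposal_number(leader_last_pn, leader_rank)
    best = None
    for peon in peons:
        accepted_pn, _last_committed, uncommitted = peon.handle_collect(pn)
        if accepted_pn > pn:
            # A peon already accepted a higher PN: retry with a larger one.
            return collect(accepted_pn, leader_rank, peons)
        if uncommitted and (best is None or uncommitted[0] > best[0]):
            best = uncommitted
    return pn, best


peons = [Peon(102, 5), Peon(301, 5, uncommitted=(301, 6, b"x"))]
pn, pending = collect(100, 0, peons)
```

In this run the first attempt fails because the second peon has already accepted a higher PN, so the leader retries above it and learns about that peon's uncommitted value.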

After phase 1 comes phase 2, the normal working flow of proposing, accepting, and committing.
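The propose, accept, and commit steps can be sketched like this. The sketch assumes that a peon accepts only a proposal carrying the PN it adopted in phase 1, and that the leader commits only when every peon in the quorum has accepted. The class and function names are hypothetical, and message passing, timeouts, and persistence are left out.

```python
class Peon:
    def __init__(self, accepted_pn):
        self.accepted_pn = accepted_pn
        self.last_committed = 0
        self.uncommitted = None  # (version, value) accepted but not committed
        self.values = {}

    def handle_begin(self, pn, version, value):
        """Accept the proposal only if it carries the PN we adopted."""
        if pn != self.accepted_pn:
            return False
        self.uncommitted = (version, value)
        return True

    def handle_commit(self, version):
        stored_version, value = self.uncommitted
        assert stored_version == version
        self.values[version] = value
        self.last_committed = version
        self.uncommitted = None


def propose(pn, last_committed, value, peons):
    """Leader side: begin, gather accepts, commit. Returns the version or None."""
    version = last_committed + 1
    accepts = [peon.handle_begin(pn, version, value) for peon in peons]
    if not all(accepts):
        # Assumption: every quorum member must accept before a commit.
        return None
    for peon in peons:
        peon.handle_commit(version)
    return version


peons = [Peon(400), Peon(400)]
committed = propose(400, 7, b"osdmap-8", peons)

# A peon that has adopted a newer PN rejects the old leader's proposal.
rejected = propose(400, 8, b"osdmap-9", [peons[0], Peon(500)])
```

After the rejected round, the first peon is left holding an accepted but uncommitted value, which is the situation the uncommitted_* variables exist to recover from.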

 

The detailed processes are as follows.

how the client's requests are dealt with

When a client sends a request to the monitor, the monitor first dispatches it to the corresponding PaxosService. The PaxosService then calls different functions depending on whether the request is a read or a write, and decides whether the propose process should be triggered.
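A sketch of this dispatch logic. The method names `preprocess_query`, `prepare_update`, and `propose_pending` are modeled on Ceph's PaxosService, but the bodies, the dict-based request format, and the `pool_count` key are hypothetical, and the propose step commits at once instead of running a Paxos round.

```python
class PaxosService:
    def __init__(self):
        self.committed = {"pool_count": 0}  # state that reads are served from
        self.pending = {}                   # staged changes awaiting a proposal
        self.proposals = 0

    def preprocess_query(self, request):
        """Handle read-only requests; return (handled, reply)."""
        if request["op"] == "get":
            return True, self.committed.get(request["key"])
        return False, None

    def prepare_update(self, request):
        """Stage a write; return True if it should trigger a proposal."""
        self.pending[request["key"]] = request["value"]
        return True

    def propose_pending(self):
        # Stand-in for a Paxos round: commit the staged changes at once.
        self.committed.update(self.pending)
        self.pending.clear()
        self.proposals += 1

    def dispatch(self, request):
        handled, reply = self.preprocess_query(request)
        if handled:
            return reply  # reads never trigger a proposal
        if self.prepare_update(request):
            self.propose_pending()
        return "ok"


svc = PaxosService()
before = svc.dispatch({"op": "get", "key": "pool_count"})
ack = svc.dispatch({"op": "set", "key": "pool_count", "value": 3})
after = svc.dispatch({"op": "get", "key": "pool_count"})
```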

 

Original post: https://www.cnblogs.com/yi-mu-xi/p/10361795.html