Which server does phxpaxos sync its checkpoint from?

1. State synchronization

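In a production environment, a newly started node must be able to synchronize its state from other nodes, a process also called instance alignment (Learn). That section shows that C's data can be learned from B, but it does not spell out how a node that needs to learn should choose its source in a concrete deployment, or which node it should treat as authoritative. Answering that requires reading the code.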

2. The learn timer

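When the Instance class starts up, the first thing it does is start a timer that periodically tries to learn from every machine in the same group: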
int Instance :: Init()
{
    ……
    // Arm the learner's periodic ask-for-learn timer.
    m_oLearner.Reset_AskforLearn_Noop();

    PLGImp("OK");

    return 0;
}

void Learner :: AskforLearn_Noop(const bool bIsStart)
{
    // Re-arm the timer first so this function runs again later.
    Reset_AskforLearn_Noop();

    m_bIsIMLearning = false;

    m_poCheckpointMgr->ExitCheckpointMode();

    AskforLearn();

    if (bIsStart)
    {
        // On startup, ask a second time.
        AskforLearn();
    }
}
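
The re-arm-then-ask pattern above can be sketched in isolation. This is a minimal model, not phxpaxos code: RetryTimer, its interval, and the explicit llNowMs clock argument are all hypothetical names introduced so that the behaviour is deterministic and easy to follow.

```cpp
#include <cstdint>
#include <functional>
#include <utility>

// Hypothetical stand-in for the learner's ask-for-learn timer.
// The current time is passed in explicitly instead of read from a clock.
class RetryTimer
{
public:
    RetryTimer(uint64_t llIntervalMs, std::function<void()> fnOnFire)
        : m_llIntervalMs(llIntervalMs), m_fnOnFire(std::move(fnOnFire)) {}

    // Re-arm: the next firing is one interval after llNowMs.
    void Reset(uint64_t llNowMs)
    {
        m_llDeadlineMs = llNowMs + m_llIntervalMs;
        m_bArmed = true;
    }

    // Called from the event loop. Returns true if the timer fired.
    bool Tick(uint64_t llNowMs)
    {
        if (!m_bArmed || llNowMs < m_llDeadlineMs)
        {
            return false;
        }
        Reset(llNowMs);  // re-arm first, mirroring AskforLearn_Noop
        m_fnOnFire();    // then run the action (the broadcast, in the real code)
        return true;
    }

private:
    uint64_t m_llIntervalMs;
    std::function<void()> m_fnOnFire;
    uint64_t m_llDeadlineMs = 0;
    bool m_bArmed = false;
};
```

Because the timer is re-armed before the callback runs, a callback that returns early still leaves the next attempt scheduled.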

3. What happens when the timer fires

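The node broadcasts a MsgType_PaxosLearner_AskforLearn message to the group. Every node that receives it replies with a MsgType_PaxosLearner_SendNowInstanceID message carrying its own largest local instance ID. To request a checkpoint from a replying node, the learner calls AskforCheckpoint with that node's ID: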
void Learner :: AskforCheckpoint(const nodeid_t iSendNodeID)
{
    PLGHead("START");

    // Proceed only if the checkpoint manager allows it (see below).
    int ret = m_poCheckpointMgr->PrepareForAskforCheckpoint(iSendNodeID);
    if (ret != 0)
    {
        return;
    }

    PaxosMsg oPaxosMsg;

    oPaxosMsg.set_instanceid(GetInstanceID());
    oPaxosMsg.set_nodeid(m_poConfig->GetMyNodeID());
    oPaxosMsg.set_msgtype(MsgType_PaxosLearner_AskforCheckpoint);

    PLGHead("END InstanceID %lu MyNodeID %lu", GetInstanceID(), oPaxosMsg.nodeid());

    // The request goes only to the node whose reply triggered this call.
    SendMessage(iSendNodeID, oPaxosMsg);
}

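The learner calls AskforCheckpoint to prepare for learning, but PrepareForAskforCheckpoint first checks whether a majority of nodes (GetMajorityCount()) have replied. Only when that holds, or when more than 60 seconds have passed since the first request, does it set the m_bInAskforCheckpointMode flag and enter checkpoint mode: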
int CheckpointMgr :: PrepareForAskforCheckpoint(const nodeid_t iSendNodeID)
{
    // Record every node whose reply led to a checkpoint request.
    if (m_setNeedAsk.find(iSendNodeID) == m_setNeedAsk.end())
    {
        m_setNeedAsk.insert(iSendNodeID);
    }

    // Start the 60-second clock on the first request.
    if (m_llLastAskforCheckpointTime == 0)
    {
        m_llLastAskforCheckpointTime = Time::GetSteadyClockMS();
    }

    uint64_t llNowTime = Time::GetSteadyClockMS();
    if (llNowTime > m_llLastAskforCheckpointTime + 60000)
    {
        // Timed out waiting for a majority: proceed anyway.
        PLGImp("no majority reply, just ask for checkpoint");
    }
    else
    {
        if ((int)m_setNeedAsk.size() < m_poConfig->GetMajorityCount())
        {
            PLGImp("Need more other tell us need to askforcheckpoint");
            return -2;
        }
    }

    m_llLastAskforCheckpointTime = 0;
    m_bInAskforCheckpointMode = true;

    return 0;
}
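
To make the gate easier to reason about, here is a minimal standalone model of it. CheckpointGate, the nodeid_t alias, and the explicit llNowMs parameter are assumptions made for illustration. A boolean replaces the "time == 0" sentinel of the original, and the 60000 ms limit is taken from the code above.

```cpp
#include <cstdint>
#include <set>

using nodeid_t = uint64_t;  // assumed width, for illustration only

// Simplified model of the check in PrepareForAskforCheckpoint.
class CheckpointGate
{
public:
    explicit CheckpointGate(int iMajorityCount)
        : m_iMajorityCount(iMajorityCount) {}

    // Returns true when the caller may send AskforCheckpoint to iSendNodeID.
    bool Prepare(nodeid_t iSendNodeID, uint64_t llNowMs)
    {
        m_setNeedAsk.insert(iSendNodeID);  // duplicates are ignored by std::set

        if (!m_bClockStarted)
        {
            m_llFirstAskMs = llNowMs;
            m_bClockStarted = true;
        }

        const bool bTimedOut = llNowMs > m_llFirstAskMs + 60000;
        if (!bTimedOut && (int)m_setNeedAsk.size() < m_iMajorityCount)
        {
            return false;  // not enough distinct nodes yet
        }

        m_bClockStarted = false;
        m_bInAskforCheckpointMode = true;
        return true;
    }

    bool InAskforCheckpointMode() const { return m_bInAskforCheckpointMode; }

private:
    int m_iMajorityCount;
    std::set<nodeid_t> m_setNeedAsk;
    uint64_t m_llFirstAskMs = 0;
    bool m_bClockStarted = false;
    bool m_bInAskforCheckpointMode = false;
};
```

With a majority count of 3, calls for nodes 2 and 4 return false, and the call for a third distinct node returns true. That third node is the one the request is sent to, since AskforCheckpoint passes the same iSendNodeID on to SendMessage.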

4. How MsgType_PaxosLearner_SendNowInstanceID packets from other nodes are handled

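Judging from the code, once learning starts the node concentrates on it exclusively. But the initial broadcast draws replies from the other nodes as well, and those replies also satisfy the majority condition in PrepareForAskforCheckpoint. So will checkpoints be synchronized from all of them?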

void Instance :: OnReceive(const std::string & sBuffer)
{
    BP->GetInstanceBP()->OnReceive();
    ……

    int iCmd = oHeader.cmdid();

    if (iCmd == MsgCmd_PaxosMsg)
    {
        // While waiting for a checkpoint, drop every Paxos message.
        if (m_oCheckpointMgr.InAskforcheckpointMode())
        {
            PLGImp("in ask for checkpoint mode, ignord paxosmsg");
            return;
        }

        ……

        OnReceivePaxosMsg(oPaxosMsg);
    }
    ……
}
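Once the checkpoint request has been sent, m_bInAskforCheckpointMode is set and the node is in checkpoint mode. A MsgType_PaxosLearner_SendNowInstanceID reply is a Paxos-type message, so it has to pass the if (m_oCheckpointMgr.InAskforcheckpointMode()) filter. While the node is in checkpoint-learning mode, that reply is ignored.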
const bool CheckpointMgr :: InAskforcheckpointMode() const
{
    return m_bInAskforCheckpointMode;
}
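
The effect of the filter can be shown with a small model. Receiver and its fields are hypothetical, and so are the numeric enum values. MsgCmd_CheckpointMsg is an assumption that checkpoint data arrives under a command ID separate from MsgCmd_PaxosMsg, which the excerpt above does not show.

```cpp
#include <cstdint>
#include <vector>

// Command IDs; the numeric values are placeholders for this sketch.
enum MsgCmd
{
    MsgCmd_PaxosMsg = 1,
    MsgCmd_CheckpointMsg = 2  // assumed separate channel for checkpoint data
};

// Hypothetical receiver that models only the filtering decision.
struct Receiver
{
    bool bInAskforCheckpointMode = false;
    std::vector<uint64_t> vecHandledPaxos;  // senders whose Paxos messages were processed
    int iCheckpointMsgs = 0;

    void OnReceive(MsgCmd eCmd, uint64_t iFromNodeID)
    {
        if (eCmd == MsgCmd_PaxosMsg)
        {
            if (bInAskforCheckpointMode)
            {
                return;  // dropped, like late SendNowInstanceID replies
            }
            vecHandledPaxos.push_back(iFromNodeID);
        }
        else if (eCmd == MsgCmd_CheckpointMsg)
        {
            ++iCheckpointMsgs;  // not subject to the Paxos-message filter
        }
    }
};
```

In this model, a reply that arrives after the flag is set never reaches the Paxos handler, so it cannot trigger a second checkpoint request to a different node.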

5. Summary

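The node learns from the sender of the reply that first brings the count of replying nodes up to a majority. Note that m_setNeedAsk in CheckpointMgr :: PrepareForAskforCheckpoint is not cleared after learning. This is possibly because the system restarts once the checkpoint has been synchronized, which would make clearing it unnecessary.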