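[Translation] RabbitMQ 2.5 Where’s my message? Durability and you

I haven’t studied RabbitMQ in depth yet; these translations are only kept as reference notes, and I hope they don’t mislead anyone.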

There’s a dirty secret about creating queues and exchanges in Rabbit: by default they

don’t survive reboot. That’s right; restart your RabbitMQ server and watch those

queues and exchanges go poof (along with the messages inside). The reason is

a property on every queue and exchange called durable. It defaults to

false, and tells RabbitMQ whether the queue (or exchange) should be re-created after

a crash or restart of Rabbit. Set it to true and you won’t have to re-create those queues

and exchanges when the power supply in your server dies. You might also think that

setting durable to true on the exchanges and queues is all you need to do to make

your messages survive a reboot, but you’d be wrong.

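RabbitMQ has a dirty secret: by default, queues and exchanges don’t survive a restart.

Restart your RabbitMQ server and the queues and exchanges, along with the messages inside them, are gone.

That’s because every queue and exchange has a property called durable, which defaults to false.

This property determines whether the queue or exchange is re-created after RabbitMQ crashes or restarts.

You might think that setting durable to true is all it takes for messages to survive a RabbitMQ server restart. You’d be wrong.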
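
As a minimal sketch of the durable flag described above, the function below declares a durable exchange and a durable queue and binds them. It assumes a channel object that behaves like a pika 1.x BlockingChannel; the names orders, order_queue, and new_order are hypothetical, and RecordingChannel is only a stand-in so the sketch runs without a broker.

```python
def declare_durable_topology(channel, exchange, queue, routing_key):
    """Declare a durable exchange and queue, then bind them.

    `channel` is assumed to behave like a pika 1.x BlockingChannel.
    """
    # durable=True asks the broker to re-create the exchange after a restart.
    channel.exchange_declare(exchange=exchange, exchange_type="direct", durable=True)
    # Same flag for the queue; without it the queue is gone after a restart.
    channel.queue_declare(queue=queue, durable=True)
    channel.queue_bind(queue=queue, exchange=exchange, routing_key=routing_key)


class RecordingChannel:
    """Stand-in channel that records calls, so the sketch runs without a broker."""

    def __init__(self):
        self.calls = []

    def __getattr__(self, name):
        # Only reached for attributes that do not exist, i.e. the AMQP methods.
        def record(**kwargs):
            self.calls.append((name, kwargs))
        return record


# Hypothetical names, used only for illustration.
channel = RecordingChannel()
declare_durable_topology(channel, "orders", "order_queue", "new_order")
for name, kwargs in channel.calls:
    print(name, kwargs)
```

With a real connection, the same function would be handed the channel returned by the client library instead of the recording stand-in. Durable declarations alone still don’t protect the messages themselves.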

Whereas queues and exchanges

must be durable to allow messages to survive reboot, it isn’t enough on its own.

A message that can survive a crash of the AMQP broker is called persistent. You flag a

message as persistent by setting the delivery mode option of the message to 2 (your

AMQP client may use a human-friendly constant instead) before publishing it.

A message that can survive a crash of the AMQP broker is called persistent.

You make a message persistent by setting its delivery mode to 2 before publishing it.

At this

point, the message is indicated as persistent, but it must be published to an exchange

that’s durable and arrive in a queue that’s durable to survive. If this weren’t the case,

the queue (or exchange) a persistent message was sitting in when Rabbit crashed

wouldn’t exist when Rabbit restarted, thereby orphaning the message. So, for a message

that’s in flight inside Rabbit to survive a crash, the message must

~ Have its delivery mode option set to 2 (persistent)

~ Be published into a durable exchange

~ Arrive in a durable queue

Do these three things and you won’t have to play Where’s Waldo with your critical

messages.

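At this point the message is persistent, but it must still be published to a durable exchange and arrive in a durable queue.

Otherwise, even a persistent message will be gone after a restart.

So, for a message to survive, it must:

~ Have its delivery mode set to 2

~ Be published to a durable exchange

~ Arrive in a durable queue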
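
The three conditions in the list above can be written as a small truth check. This is only an illustrative model, not broker code; the function name survives_restart and the constants are made up for the example. Delivery mode 1 is the AMQP value for a non-persistent message.

```python
from itertools import product

PERSISTENT = 2  # AMQP delivery mode for persistent messages
TRANSIENT = 1   # AMQP delivery mode for non-persistent messages


def survives_restart(delivery_mode, exchange_durable, queue_durable):
    """Return True only when all three conditions from the list hold."""
    return bool(delivery_mode == PERSISTENT and exchange_durable and queue_durable)


# Walk through every combination to show that only one of them is safe.
for mode, ex, q in product((TRANSIENT, PERSISTENT), (False, True), (False, True)):
    verdict = survives_restart(mode, ex, q)
    print(f"mode={mode} exchange_durable={ex} queue_durable={q} -> {verdict}")
```

Of the eight combinations, only delivery mode 2 with a durable exchange and a durable queue returns True. With the pika client, the delivery mode would typically be set through pika.BasicProperties(delivery_mode=2) when publishing.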

The way that RabbitMQ ensures persistent messages survive a restart is by writing

them to the disk inside of a persistency log file. When you publish a persistent message

to a durable exchange, Rabbit won’t send the response until the message is committed

to the log file. Keep in mind, though, that if it gets routed to a nondurable

queue after that, it’s automatically removed from the persistency log and won’t survive

a restart. When you use persistent messages it’s crucial that you make sure all three

elements required for a message to persist are in place (we can’t stress this enough).

Once you consume a persistent message from a durable queue (and acknowledge it),

RabbitMQ flags it in the persistency log for garbage collection. If Rabbit restarts anytime

before you consume a persistent message, it’ll automatically re-create the

exchanges and queues (and bindings) and replay any messages in the persistency log

into the appropriate queues or exchanges (depending on where in the routing process

the messages were when Rabbit died).

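RabbitMQ makes messages persistent by writing them to a log file on disk.

When you publish a persistent message to a durable exchange, RabbitMQ won’t respond until the message has been written to the log file.

Remember that if the message is then routed to a nondurable queue, it’s automatically removed from the log and won’t survive a restart.

Once you consume a message from a durable queue and acknowledge it,

RabbitMQ flags it in the log for garbage collection.

If RabbitMQ restarts before you consume a persistent message, it automatically re-creates the exchanges, queues, and bindings,

and replays the messages from the log file into the appropriate queues, or routes them again (depending on when RabbitMQ died).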
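
The log behavior described above (append on publish, flag on acknowledgment, replay on restart) can be sketched with a toy in-memory model. This is an assumption-laden illustration and not RabbitMQ’s real on-disk format or API; ToyPersistencyLog and the queue name order_queue are invented for the example.

```python
class ToyPersistencyLog:
    """Toy model of a persistency log; not RabbitMQ's real on-disk format."""

    def __init__(self):
        self.entries = {}         # message id -> (queue name, body)
        self.collectable = set()  # ids flagged for garbage collection
        self._next_id = 1

    def append(self, queue, body):
        """Record a persistent message before the publish is answered."""
        msg_id = self._next_id
        self._next_id += 1
        self.entries[msg_id] = (queue, body)
        return msg_id

    def ack(self, msg_id):
        """A consumer acknowledged the message: flag it for collection."""
        self.collectable.add(msg_id)

    def collect_garbage(self):
        """Drop every flagged entry from the log."""
        for msg_id in self.collectable:
            self.entries.pop(msg_id, None)
        self.collectable.clear()

    def replay(self):
        """Rebuild queue contents after a restart from unacknowledged entries."""
        queues = {}
        for msg_id in sorted(self.entries):
            if msg_id in self.collectable:
                continue
            queue, body = self.entries[msg_id]
            queues.setdefault(queue, []).append(body)
        return queues


log = ToyPersistencyLog()
first = log.append("order_queue", "order 1")
second = log.append("order_queue", "order 2")
log.ack(first)        # consumed and acknowledged before the crash
print(log.replay())   # only the unacknowledged message comes back
```

The point of the model is the ordering: a message enters the log before the publish is answered, and it leaves the replay set only once a consumer has acknowledged it.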

You might be thinking that you should use persistent messaging for all of your messages.

You could do that, but you’d pay a price for ensuring your messages survive

Rabbit restarts: performance. The act of writing messages to disk is much slower than

just storing them in RAM, and will significantly decrease the number of messages per

second your RabbitMQ server can process.

It’s not uncommon to see a 10x or more

decrease in message throughput when using persistency. There’s also the issue that

persistent messages don’t play well with RabbitMQ’s built-in clustering. Though

RabbitMQ clustering allows you to talk to any queue present in the cluster from any

node, those queues are actually evenly distributed among the nodes without redundancy

(there’s no backup copy of any queue on a second node in the cluster). If the

cluster node hosting your seed_bin queue crashes, the queue disappears from the

cluster until the node is restored … if the queue was durable. More important, while

the node is down its queues aren’t available and the durable ones can’t be re-created.

This can lead to black-holing of messages. We’ll cover the behavior in more detail and

show alternate clustering approaches to get around this in chapter 5.

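You might be thinking that you should make all of your messages persistent.

You could, but you’d pay for it in performance: writing to disk is much slower than writing to RAM,

and it significantly reduces the number of messages per second your RabbitMQ server can process.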

Given the trade-offs, when should you use persistent/durable messaging? First, you

need to analyze (and test) your performance needs. Do you need to process 100,000

messages per second on a single Rabbit server? If so, you should probably look at

other ways of ensuring message delivery (or get a very fast storage system). For example,

your producer could listen to a reply queue on a separate channel. Every time it

publishes a message, it includes the name of the reply queue so that the consumer can

send a reply back to confirm receipt. If a message isn’t replied to within a reasonable

amount of time, the producer can republish the message. That said, the critical

nature of messages requiring guaranteed delivery generally means they’re lower in

volume than other types of messages (such as logging messages). So if persistent messaging

meets your performance needs, it’s an excellent way to help ensure delivery.

We use it a lot for critical messages. We’re just selective about what types of content

use persistent messaging. For example, we run two types of Rabbit clusters: traditional

RabbitMQ clustering for nonpersistent messaging, and pairs of active/hot-standby

nonclustered Rabbit servers for persistent messaging (using load balancers). This

ensures the processing load for persistent messaging doesn’t slow down nonpersistent

messages. It also means Rabbit’s built-in clustering won’t black-hole persistent messages

when a node dies. Do keep in mind that while Rabbit can help ensure delivery, it

can never absolutely guarantee it. Hard drive corruption, buggy behavior by a consumer,

or other extreme events can trash/black-hole persistent messages. It’s ultimately

up to you to ensure your messages arrive where they need to go, and persistent

messaging is a great tool to help you get there.
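
The reply-queue alternative mentioned above (republish when no reply arrives within a reasonable time) needs some producer-side bookkeeping. Here is a rough sketch of that bookkeeping only, with no broker involved; the class PendingPublishes, the message IDs, and the five-second timeout are all assumptions made for illustration.

```python
import time


class PendingPublishes:
    """Track published messages until a reply confirms receipt."""

    def __init__(self, timeout_seconds, clock=time.monotonic):
        self.timeout_seconds = timeout_seconds
        self.clock = clock
        self.pending = {}  # message id -> (body, time published)

    def published(self, msg_id, body):
        self.pending[msg_id] = (body, self.clock())

    def reply_received(self, msg_id):
        """The consumer confirmed receipt on the reply queue."""
        self.pending.pop(msg_id, None)

    def due_for_republish(self):
        """Return (id, body) pairs whose reply is overdue and restart their timers."""
        now = self.clock()
        overdue = [
            (msg_id, body)
            for msg_id, (body, sent_at) in self.pending.items()
            if now - sent_at >= self.timeout_seconds
        ]
        for msg_id, body in overdue:
            self.pending[msg_id] = (body, now)
        return overdue


# A fake clock keeps the example deterministic.
fake_now = [0.0]
tracker = PendingPublishes(timeout_seconds=5.0, clock=lambda: fake_now[0])
tracker.published("m1", "charge card")
tracker.published("m2", "send receipt")
tracker.reply_received("m1")   # the consumer replied for m1
fake_now[0] = 6.0              # six seconds pass with no reply for m2
print(tracker.due_for_republish())
```

Because a reply can be late rather than lost, a scheme like this can deliver the same message twice, so consumers need to tolerate duplicates.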

A concept that’s related to the durability of a message is the AMQP transaction. So

far we’ve talked about marking messages, queues, and exchanges as durable. That’s all

well and good for keeping a message safe once RabbitMQ has it in its custody, but

since a publish operation returns no response to the producer, how do you know if

the broker has persisted the durable message to disk? Should the broker die before it

can write the message to disk, the message would be lost and you wouldn’t know.

That’s where transactions come in. When you absolutely need to be sure the broker

has the message in custody (and has routed the message to all matching subscribed

queues) before you move on to another task, you need to wrap it in a transaction. If

you come from a database background, it’s important not to confuse AMQP transactions

with what “transaction” means in most databases. In AMQP, after you place a

channel into transaction mode, you send it the publish you want to confirm, followed

by zero or more other AMQP commands that should be executed or ignored depending

on whether the initial publish succeeded. Once you’ve sent all of the commands,

you commit the transaction. If the transaction’s initial publish succeeds, then the channel

will complete the other AMQP commands in the transaction. If the publish fails,

none of the other AMQP commands will be executed. Transactions close the “last

mile” gap between producers publishing messages and RabbitMQ committing them

to disk, but there’s a better way to close that gap.
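
A minimal sketch of the transaction flow described above: select, publish, commit. It assumes a channel that behaves like a pika 1.x BlockingChannel, which provides tx_select, basic_publish, and tx_commit; the exchange and routing key names are hypothetical, and RecordingChannel is a stand-in so the call order can be shown without a broker.

```python
def publish_in_transaction(channel, exchange, routing_key, body, properties=None):
    """Publish one message inside an AMQP transaction.

    `channel` is assumed to behave like a pika 1.x BlockingChannel.
    """
    channel.tx_select()  # put the channel into transaction mode
    channel.basic_publish(
        exchange=exchange,
        routing_key=routing_key,
        body=body,
        properties=properties,
    )
    channel.tx_commit()  # the producer waits here for the broker's answer


class RecordingChannel:
    """Stand-in channel that records calls, so the sketch runs without a broker."""

    def __init__(self):
        self.calls = []

    def __getattr__(self, name):
        def record(**kwargs):
            self.calls.append((name, kwargs))
        return record


# Hypothetical exchange and routing key, used only for illustration.
channel = RecordingChannel()
publish_in_transaction(channel, "orders", "new_order", b"order 42")
print([name for name, _ in channel.calls])
```

The commit is the synchronous step: the producer can’t move on until the broker answers it, which is where the throughput cost comes from.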

Though transactions are a part of the formal AMQP 0-9-1 specification, they have

an Achilles heel in that they’re huge drains on Rabbit performance. Not only can

using transactions drop your message throughput by a factor of 2–10x, but they also

make your producer app synchronous, which is one of the things you’re trying to get

rid of with messaging. Knowing all of this, the guys at RabbitMQ decided to come up

with a better way to ensure message delivery: publisher confirms. Similar to transactions,

you have to tell Rabbit to place the channel into confirm mode, and you can’t turn it

off without re-creating the channel. Once a channel is in confirm mode, every message

published on the channel will be assigned a unique ID number (starting at 1).

Once the message has been delivered to all queues that have bindings matching the

message’s routing key, the channel will issue a publisher confirm to the producer app

(containing the message’s unique ID). This lets the producer know the message has

been safely queued at all of its destinations. If the message and the queues are durable,

the confirm is issued only after the queues have written the message to disk. The

major benefit of publisher confirms is that they’re asynchronous. Once a message has

been published, the producer app can go on to the next message while waiting for the

confirm. When the confirm for that message is finally received, a callback function in

the producer app will be fired so it can wake up and handle the confirmation. If an

internal error occurs inside Rabbit that causes a message to be lost, Rabbit will send a

message nack (not acknowledged) that’s like a publisher confirm (it has the message’s

unique ID) but indicates the message was lost. Also, since there’s no concept of message

rollback (as with transactions), publisher confirms are much lighter weight and

have an almost negligible performance hit on the Rabbit broker.
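
On the producer side, publisher confirms come down to tracking which message IDs are still outstanding. The sketch below models only that bookkeeping and does not talk to a broker; ConfirmTracker is an invented name. The multiple flag isn’t mentioned in the text above: it is part of the AMQP ack and nack methods and means "this ID and every earlier outstanding one".

```python
class ConfirmTracker:
    """Producer-side bookkeeping for a channel in confirm mode."""

    def __init__(self):
        self.next_tag = 1      # IDs start at 1 once confirm mode is on
        self.unconfirmed = {}  # message ID -> message body
        self.lost = []         # bodies the broker nacked

    def on_publish(self, body):
        """Remember a message under the ID the channel will assign to it."""
        tag = self.next_tag
        self.next_tag += 1
        self.unconfirmed[tag] = body
        return tag

    def _settle(self, tag, multiple):
        """Remove and return the bodies covered by an ack or nack."""
        tags = [t for t in self.unconfirmed if t <= tag] if multiple else [tag]
        return [self.unconfirmed.pop(t) for t in sorted(tags) if t in self.unconfirmed]

    def on_ack(self, tag, multiple=False):
        self._settle(tag, multiple)

    def on_nack(self, tag, multiple=False):
        # A nack means the broker lost the message; keep it for republishing.
        self.lost.extend(self._settle(tag, multiple))


tracker = ConfirmTracker()
for body in ("a", "b", "c", "d"):
    tracker.on_publish(body)
tracker.on_ack(2, multiple=True)  # confirms messages 1 and 2
tracker.on_nack(3)                # message 3 was lost inside the broker
print(tracker.unconfirmed, tracker.lost)
```

In a real producer, on_ack and on_nack would be wired to the client library’s confirm callbacks, and anything left in lost would be republished.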

Now you have the individual parts of RabbitMQ down, from consumers and producers

to durable messaging, but how do they all fit together? What does the lifecycle

of an actual message look like from beginning to end? The best way to answer that is

to look at the life of a message in code.

To do: the translation isn’t finished yet.