Cache and Database

This article is based on http://coolshell.cn/articles/17416.html

We all know that high concurrency and heavy I/O are a big challenge for a database, so we normally put a cache system in front of it. The cache system normally stores data in RAM, so it performs better.

But this kind of architecture has one big problem. The cache and the database are two independent storage systems, so an operation that touches both cannot be atomic. In other words, the data in the two systems may become inconsistent.

To address the inconsistency problem, there are several strategies for how the cache and the database work together.

Cache Aside Pattern

When reading data, the application first reads the cache. On a hit, it returns the cached data. On a miss, it reads the data from the database and stores it in the cache so that later queries can be served from the cache directly.
When updating data, the application first updates the database, then invalidates the corresponding entry in the cache.
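
The read and update paths above can be sketched as follows. This is a minimal illustration, not a production implementation: the class name CacheAside is hypothetical, and plain dicts stand in for the real cache and database.

```python
class CacheAside:
    """Cache-aside access: the application talks to both stores itself."""

    def __init__(self, db, cache):
        self.db = db        # assumption: a dict standing in for the database
        self.cache = cache  # assumption: a dict standing in for the cache

    def read(self, key):
        if key in self.cache:            # hit: serve from the cache
            return self.cache[key]
        value = self.db.get(key)         # miss: load from the database
        if value is not None:
            self.cache[key] = value      # populate the cache for later reads
        return value

    def update(self, key, value):
        self.db[key] = value             # 1. update the database first
        self.cache.pop(key, None)        # 2. then invalidate the cache entry


db = {"e1": "old"}
cache = {}
store = CacheAside(db, cache)

first = store.read("e1")                 # miss, loads from db and caches
cached_after_read = dict(cache)
store.update("e1", "new")                # db updated, cache entry dropped
cached_after_update = dict(cache)
second = store.read("e1")                # miss again, reloads the new value
```

Note that the update invalidates the cache entry instead of rewriting it; the next read reloads the value from the database.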

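This strategy may still produce dirty data. For example:

time point 1: process A misses the cache and reads entry e1 from the db
time point 2: process B updates e1 in the db to E1 and invalidates the corresponding cache entry
time point 3: process A puts the e1 it read earlier into the cache

Now the cached e1 is dirty, because the real value is E1.

In practice this rarely happens. It requires the read to miss the cache, and the write to start after the reader's database query and finish before the reader stores its value in the cache. Reads are normally much faster than writes, so that window is small.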
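
The three time points can be replayed deterministically in a single thread. This is only a sketch of the interleaving, assuming dicts in place of the real cache and database; a real race would involve two concurrent processes.

```python
db = {"e1": "e1"}
cache = {}

# time point 1: process A misses the cache and reads e1 from the database
value_read_by_a = db["e1"]

# time point 2: process B updates the database and invalidates the cache entry
db["e1"] = "E1"
cache.pop("e1", None)

# time point 3: process A stores the value it read at time point 1
cache["e1"] = value_read_by_a

# The cache now disagrees with the database until the entry is invalidated again.
stale = cache["e1"] != db["e1"]
```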

Read/Write Through Pattern

In the Read/Write Through pattern, the application does not know whether it is talking to a cache or a database. To the application, there is only one storage layer. In read through, when a read misses the cache (the entry expired or was evicted by LRU), the cache itself is responsible for retrieving the data from the database and storing it.

In write through, the application updates data through the cache. If the entry is not in the cache, the database is updated directly. If the entry is in the cache, the cache entry is updated and the cache system then updates the database synchronously.
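
A minimal sketch of both halves of this pattern, assuming a dict as the database. The class ThroughCache and its methods are hypothetical names chosen for illustration; the point is that the application only ever calls the cache.

```python
class ThroughCache:
    """Hypothetical cache that fronts a dict-backed database."""

    def __init__(self, db):
        self._db = db
        self._entries = {}

    def get(self, key):
        # Read through: on a miss the cache itself loads from the database.
        if key not in self._entries:
            if key not in self._db:
                return None
            self._entries[key] = self._db[key]
        return self._entries[key]

    def put(self, key, value):
        # Write through: refresh the cached copy only if one exists,
        # then write to the database synchronously in either case.
        if key in self._entries:
            self._entries[key] = value
        self._db[key] = value

    def is_cached(self, key):
        return key in self._entries


db = {"k": 1}
store = ThroughCache(db)

loaded = store.get("k")                    # miss: loaded from db and cached
store.put("k", 2)                          # cached entry and db both updated
store.put("n", 9)                          # not cached: only the db is written
n_cached_after_put = store.is_cached("n")
n_value = store.get("n")                   # read through loads it on demand
```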

Write Behind Caching Pattern

On update, the application only updates the cache. The cache writes the changes to the database asynchronously, in batches. Cached data is written back to the database in several circumstances, such as when the cache runs out of space. For this reason the pattern is also called lazy write.

This strategy may cause data loss: updates that exist only in the cache are lost if the cache fails before they are written back.
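
The batching and the data-loss window can be sketched as follows. The class WriteBehindCache and its max_dirty threshold are assumptions made for this example; a real cache would typically flush on eviction, on a timer, or under memory pressure, and would do so in the background.

```python
class WriteBehindCache:
    """Hypothetical write-behind cache over a dict-backed database."""

    def __init__(self, db, max_dirty=2):
        self._db = db
        self._entries = {}
        self._dirty = set()           # keys changed but not yet written back
        self._max_dirty = max_dirty   # assumed flush trigger for this sketch

    def put(self, key, value):
        # Only the cache is updated; the database write is deferred.
        self._entries[key] = value
        self._dirty.add(key)
        if len(self._dirty) >= self._max_dirty:
            self.flush()

    def get(self, key):
        if key in self._entries:
            return self._entries[key]
        return self._db.get(key)

    def flush(self):
        # Write all pending changes to the database in one batch.
        for key in self._dirty:
            self._db[key] = self._entries[key]
        self._dirty.clear()


db = {}
wb = WriteBehindCache(db, max_dirty=2)

wb.put("a", 1)
db_after_first = dict(db)     # nothing written back yet
wb.put("b", 2)                # threshold reached: "a" and "b" flushed together
db_after_second = dict(db)
wb.put("c", 3)                # pending again

# If the cache process died here, "c" would never reach the database.
lost_on_crash = "c" not in db
```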

Original article: https://www.cnblogs.com/kramer/p/5976250.html