Drupal启动阶段之二:页面缓存

页面缓存是什么意思?有些页面浏览量非常大,而且与状态无关,这类页面就可以使用页面缓存技术。在页面第一次请求完毕以后,将响应结果保存起来。下一次再请求同一页面时,就不需要从头到尾再执行一遍,只需要将第一次执行的响应结果获取出来,直接返回给使用者就行了。

什么样的页面请求可以缓存?Drupal使用函数drupal_page_is_cacheable()区分哪些请求可以缓存:

function drupal_page_is_cacheable($allow_caching = NULL) {
  $allow_caching_static = &drupal_static(__FUNCTION__, TRUE);
  if (isset($allow_caching)) {
    $allow_caching_static = $allow_caching;
  }

  return $allow_caching_static && ($_SERVER['REQUEST_METHOD'] == 'GET' || $_SERVER['REQUEST_METHOD'] == 'HEAD')
    && !drupal_is_cli();
}

默认,只有GET和HEAD请求可以缓存。

如何保存第一次的请求响应结果?Drupal使用函数drupal_page_set_cache()保存响应结果。在请求执行完成后,Drupal执行该函数,使用cache_set()将响应结果保存到cache_page对象:

function drupal_page_set_cache() {
  global $base_root;

  if (drupal_page_is_cacheable()) {
    $cache = (object) array(
      'cid' => $base_root . request_uri(),
      'data' => array(
        'path' => $_GET['q'],
        'body' => ob_get_clean(),
        'title' => drupal_get_title(),
        'headers' => array(),
      ),
      'expire' => CACHE_TEMPORARY,
      'created' => REQUEST_TIME,
    );

    // Restore preferred header names based on the lower-case names returned
    // by drupal_get_http_header().
    $header_names = _drupal_set_preferred_header_name();
    foreach (drupal_get_http_header() as $name_lower => $value) {
      $cache->data['headers'][$header_names[$name_lower]] = $value;
      if ($name_lower == 'expires') {
        // Use the actual timestamp from an Expires header if available.
        $cache->expire = strtotime($value);
      }
    }

    if ($cache->data['body']) {
      if (variable_get('page_compression', TRUE) && extension_loaded('zlib')) {
        $cache->data['body'] = gzencode($cache->data['body'], 9, FORCE_GZIP);
      }
      cache_set($cache->cid, $cache->data, 'cache_page', $cache->expire);
    }
    return $cache;
  }
}

这里应该仔细看看Drupal是如何缓存的:以URL为key,缓存内容是一个数组,包含path、body、title、headers。/drupal/index.php?q=node/1和/drupal/index.php?q=node/2这会是两个缓存项。

同一页面的后续请求如何从缓存读取呢?这就是Drupal的启动函数_drupal_bootstrap_page_cache()完成的内容:

function _drupal_bootstrap_page_cache() {
  global $user;

  // Allow specifying special cache handlers in settings.php, like
  // using memcached or files for storing cache information.
  require_once DRUPAL_ROOT . '/includes/cache.inc';
  
  // 这是什么意思?
  // 是不是说cache.inc只是实现了默认的数据库缓存DrupalDatabaseCache,
  // 如果有Memcached缓存(DrupalMemcachedCache)或者File缓存(DrupalFileCache)时,
  // 可以将这些额外的类文件在settings.php中用cache_backends声明:
  //   $conf['cache_backends'][] = 'includes/cache.memcached.inc';
  //   $conf['cache_backends'][] = 'includes/cache.file.inc';
  foreach (variable_get('cache_backends', array()) as $include) {
    require_once DRUPAL_ROOT . '/' . $include;
  }
  
  // Check for a cache mode force from settings.php.
  if (variable_get('page_cache_without_database')) {
    $cache_enabled = TRUE;
  }
  else {
    drupal_bootstrap(DRUPAL_BOOTSTRAP_VARIABLES, FALSE);
    $cache_enabled = variable_get('cache');
  }
  
  drupal_block_denied(ip_address());
  
  // If there is no session cookie and cache is enabled (or forced), try
  // to serve a cached page.
  if (!isset($_COOKIE[session_name()]) && $cache_enabled) {
    // Make sure there is a user object because its timestamp will be
    // checked, hook_boot might check for anonymous user etc.
    $user = drupal_anonymous_user(); //匿名用户
    // Get the page from the cache.
    $cache = drupal_page_get_cache();
    // If there is a cached page, display it.
    if (is_object($cache)) {
      header('X-Drupal-Cache: HIT');
      // Restore the metadata cached with the page.
      $_GET['q'] = $cache->data['path'];
      drupal_set_title($cache->data['title'], PASS_THROUGH);
      date_default_timezone_set(drupal_get_user_timezone());
      // If the skipping of the bootstrap hooks is not enforced, call
      // hook_boot.
      if (variable_get('page_cache_invoke_hooks', TRUE)) {
        bootstrap_invoke_all('boot');
      }
      drupal_serve_page_from_cache($cache);
      // If the skipping of the bootstrap hooks is not enforced, call
      // hook_exit.
      if (variable_get('page_cache_invoke_hooks', TRUE)) {
        bootstrap_invoke_all('exit');
      }
      // We are done.
      exit;
    }
    else {
      header('X-Drupal-Cache: MISS');
    }
  }
}

首先,检查一下请求者的IP地址是否允许:

drupal_block_denied(ip_address());

如果请求者的IP地址是禁止的,则Drupal会请求者发送403信息:

function drupal_block_denied($ip) {
  // Deny access to blocked IP addresses - t() is not yet available.
  if (drupal_is_denied($ip)) {
    header($_SERVER['SERVER_PROTOCOL'] . ' 403 Forbidden');
    print 'Sorry, ' . check_plain(ip_address()) . ' has been banned.';
    exit();
  }
}

然后,检查请求是否是匿名的。只有匿名请求才会检查缓存:

if (!isset($_COOKIE[session_name()]) /*没有会话COOKIE时才会检查缓存*/ && $cache_enabled) {
    // ......
}

匿名请求时,Drupal通过函数drupal_anonymous_user()设置$user全局变量:

function drupal_anonymous_user() {
  $user = new stdClass();
  $user->uid = 0;
  $user->hostname = ip_address();
  $user->roles = array();
  $user->roles[DRUPAL_ANONYMOUS_RID] = 'anonymous user';
  $user->cache = 0;
  return $user;
}

同时,匿名请求也会返回一个额外的HTTP头信息X-Drupal-Cache,代表当前内容时候是否是缓存返回的:

header('X-Drupal-Cache: HIT'); // 缓存命中
header('X-Drupal-Cache: MISS'); // 缓存丢失

Drupal通过函数drupal_page_set_cache()保存页面缓存,对应的,使用函数drupal_page_get_cache()获取页面缓存:

function drupal_page_get_cache($check_only = FALSE) {
  global $base_root;
  static $cache_hit = FALSE;

  if ($check_only) {
    return $cache_hit;
  }

  if (drupal_page_is_cacheable()) {
    $cache = cache_get($base_root . request_uri(), 'cache_page');
    if ($cache !== FALSE) {
      $cache_hit = TRUE;
    }
    return $cache;
  }
}

获取的缓存内容是drupal_page_set_cache()保存的数组:

$cache = drupal_page_get_cache();

// $cache = array(
//     'path' => '...',
//   'body' => '...',
//   'title' => '...',
//   'headers' => array(...),
// );

最后,透过函数drupal_serve_page_from_cache()返回缓存内容,结束请求。

function drupal_serve_page_from_cache(stdClass $cache) {
  // Negotiate whether to use compression.
  $page_compression = variable_get('page_compression', TRUE) && extension_loaded('zlib');
  $return_compressed = $page_compression && isset($_SERVER['HTTP_ACCEPT_ENCODING']) && strpos($_SERVER['HTTP_ACCEPT_ENCODING'], 'gzip') !== FALSE;

  // Get headers set in hook_boot(). Keys are lower-case.
  $hook_boot_headers = drupal_get_http_header();

  // Headers generated in this function, that may be replaced or unset using
  // drupal_add_http_headers(). Keys are mixed-case.
  $default_headers = array();

  foreach ($cache->data['headers'] as $name => $value) {
    // In the case of a 304 response, certain headers must be sent, and the
    // remaining may not (see RFC 2616, section 10.3.5). Do not override
    // headers set in hook_boot().
    $name_lower = strtolower($name);
    if (in_array($name_lower, array('content-location', 'expires', 'cache-control', 'vary')) && !isset($hook_boot_headers[$name_lower])) {
      drupal_add_http_header($name, $value);
      unset($cache->data['headers'][$name]);
    }
  }

  // If the client sent a session cookie, a cached copy will only be served
  // to that one particular client due to Vary: Cookie. Thus, do not set
  // max-age > 0, allowing the page to be cached by external proxies, when a
  // session cookie is present unless the Vary header has been replaced or
  // unset in hook_boot().
  $max_age = !isset($_COOKIE[session_name()]) || isset($hook_boot_headers['vary']) ? variable_get('page_cache_maximum_age', 0) : 0;
  $default_headers['Cache-Control'] = 'public, max-age=' . $max_age;

  // Entity tag should change if the output changes.
  $etag = '"' . $cache->created . '-' . intval($return_compressed) . '"';
  header('Etag: ' . $etag);

  // See if the client has provided the required HTTP headers.
  $if_modified_since = isset($_SERVER['HTTP_IF_MODIFIED_SINCE']) ? strtotime($_SERVER['HTTP_IF_MODIFIED_SINCE']) : FALSE;
  $if_none_match = isset($_SERVER['HTTP_IF_NONE_MATCH']) ? stripslashes($_SERVER['HTTP_IF_NONE_MATCH']) : FALSE;

  if ($if_modified_since && $if_none_match
      && $if_none_match == $etag // etag must match
      && $if_modified_since == $cache->created) {  // if-modified-since must match
    header($_SERVER['SERVER_PROTOCOL'] . ' 304 Not Modified');
    drupal_send_headers($default_headers);
    return;
  }

  // Send the remaining headers.
  foreach ($cache->data['headers'] as $name => $value) {
    drupal_add_http_header($name, $value);
  }

  $default_headers['Last-Modified'] = gmdate(DATE_RFC1123, $cache->created);

  // HTTP/1.0 proxies does not support the Vary header, so prevent any caching
  // by sending an Expires date in the past. HTTP/1.1 clients ignores the
  // Expires header if a Cache-Control: max-age= directive is specified (see RFC
  // 2616, section 14.9.3).
  $default_headers['Expires'] = 'Sun, 19 Nov 1978 05:00:00 GMT';

  drupal_send_headers($default_headers);

  // Allow HTTP proxies to cache pages for anonymous users without a session
  // cookie. The Vary header is used to indicates the set of request-header
  // fields that fully determines whether a cache is permitted to use the
  // response to reply to a subsequent request for a given URL without
  // revalidation. If a Vary header has been set in hook_boot(), it is assumed
  // that the module knows how to cache the page.
  if (!isset($hook_boot_headers['vary']) && !variable_get('omit_vary_cookie')) {
    header('Vary: Cookie');
  }

  if ($page_compression) {
    header('Vary: Accept-Encoding', FALSE);
    // If page_compression is enabled, the cache contains gzipped data.
    if ($return_compressed) {
      // $cache->data['body'] is already gzip'ed, so make sure
      // zlib.output_compression does not compress it once more.
      ini_set('zlib.output_compression', '0');
      header('Content-Encoding: gzip');
    }
    else {
      // The client does not support compression, so unzip the data in the
      // cache. Strip the gzip header and run uncompress.
      $cache->data['body'] = gzinflate(substr(substr($cache->data['body'], 10), 0, -8));
    }
  }

  // Print the page.
  print $cache->data['body'];
}
原文地址:https://www.cnblogs.com/eastson/p/3357300.html