Nutch 1.6 Study Notes

Updated every evening or night during the summer vacation.

Nisekoi is the best!

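The Nutch 1.6 source can be downloaded here:

apache-nutch-1.6-src.zip
115 Netdisk gift code: 5lbcymlo6u76
http://115.com/lb/5lbcymlo6u76

If you would rather not compile the source yourself, you can download my compiled build directly. It includes the standalone local version and the Hadoop-dependent deploy version (64-bit):

apache-nutch1.6-runtime.zip
115 Netdisk gift code: 5lbcy4rl8e4l
http://115.com/lb/5lbcy4rl8e4l

Or download only the official deploy version:

apache-nutch-1.6-bin.tar.gz
115 Netdisk gift code: 5lbbtpwwpbq2
http://115.com/lb/5lbbtpwwpbq2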

Edit, 7/20: ---------------------

Today I came across the download location for every Nutch release: http://archive.apache.org/dist/nutch/

Every release of the Apache projects can be found here: http://archive.apache.org/dist/

-----------------------------------

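Contents

  Installing Nutch 1.6

  Using local Nutch and its commands

  The Nutch crawl cycle

  Domain statistics

  webgraph

  nodedumper and linkrank

  Injecting scores

  Lightweight crawling with freegen

  Configuring the Solr server

  Using Luke

  Configuring the custom tokenizer mmseg4j in Solr

  Configuring mmseg4j in Luke

  Solr 4.2

  Installing Nutch under Cygwin

  Nutch and Hadoop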

  

Getting started: installing Nutch 1.6 and connecting it to Hadoop 1.0.3

 http://wxweven.blog.163.com/blog/static/1974791152014127115626958/

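Using local Nutch and its commands

In the runtime folder, the local folder contains the Nutch that runs without a Hadoop installation; MapReduce runs on a single machine there.

Local Nutch is generally used for testing and debugging. Change into the local folder.

The conf folder holds many Nutch configuration files. nutch-default.xml is the default configuration and documents many of the settings. nutch-site.xml is the main configuration file; whatever it defines overrides the values in nutch-default.xml.

Before running Nutch, add the http.agent.name setting to nutch-site.xml.

The http.agent.name entry in nutch-default.xml looks like this: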

<property>
  <name>http.agent.name</name>
  <value></value>
  <description>HTTP 'User-Agent' request header. MUST NOT be empty - 
  please set this to a single word uniquely related to your organization.

  NOTE: You should also check other related properties:

    http.robots.agents
    http.agent.description
    http.agent.url
    http.agent.email
    http.agent.version

  and set their values appropriately.

  </description>
</property>

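Copy this property into nutch-site.xml and put a request header (User-Agent) in the value tag. The description asks for a single word related to your organization; I took the header from a browser instead,

for example, Chrome's request header is

Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/28.0.1500.72 Safari/537.36

Here is the complete content of my nutch-site.xml: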

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
    <property>
        <name>http.agent.name</name>
        <value>Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/28.0.1500.72 Safari/537.36</value>
    </property>

</configuration>

With the configuration changed, you can start experimenting.
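
Before starting a crawl, it can help to confirm that the agent name really made it into the file. The following is only a sketch: agent_name is a hypothetical helper, not part of Nutch, and it assumes the value line sits directly below the name line and fits on a single line, as in the nutch-site.xml shown here.

```shell
# Hypothetical helper (not part of Nutch): print the <value> that follows
# the http.agent.name <name> line in the given config file.
# Assumption: <value>...</value> is on the very next line and on one line.
agent_name() {
  grep -A 1 '<name>http.agent.name</name>' "$1" \
    | sed -n 's:.*<value>\(.*\)</value>.*:\1:p'
}

# Usage, from the local folder:
#   agent_name conf/nutch-site.xml
```

An empty result suggests the property is missing or its value is blank, which the property description says must not happen.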

Run the nutch script in bin; it asks you to supply a command.

Part of what follows is adapted from: http://www.blogjava.net/kxx129/archive/2009/09/05/294000.html

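Crawl: Crawl is an alias for “org.apache.nutch.crawl.Crawl”. It is a single command that runs the complete crawl and index process. When no Solr URL is set, the indexing step is skipped (the log further down starts with "solrUrl is not set, indexing will be skipped...").

      Usage:
      bin/nutch crawl <urlDir> [-dir d] [-threads n] [-depth i] [-topN num]

        Parameters:
        <urlDir>: an existing directory containing text files that list the URLs to crawl.
        [-dir <d>]: the working directory where Nutch stores the crawl data; defaults to ./crawl-[date], where [date] is the current date.
        [-threads <n>]: the number of Fetcher threads; overrides fetcher.threads.fetch from the default configuration (default 10).
        [-depth <i>]: the number of crawl iterations (the depth); default 5.
        [-topN <num>]: limits each iteration to the top N records; default Integer.MAX_VALUE.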

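     Example 1: ./bin/nutch crawl urls -dir data -threads 50 -depth 2 -topN 2 (do not run this command yet)

        The URLs to crawl are stored in the urls folder (Nutch reads them from the files inside it),

        the crawled data is written to data,

        50 threads do the fetching, the crawl runs for 2 iterations, and each iteration fetches at most the top 2 records.

        Note that, for efficiency, Nutch does not follow a strict depth-first or breadth-first order; as the Generator line in the log puts it, each round selects the best-scoring URLs that are due for fetch.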

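      Example 2: nohup ./bin/nutch crawl urls -dir data -threads 50 -depth 2 -topN 2 &

          With nohup in front, the command's output is written to nohup.out in the current directory

          (a more detailed log is kept in logs/hadoop.log).

          The trailing & runs the command in the background on Linux.

          If something goes wrong, nohup.out contains entries that look like Java exceptions.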
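
A quick way to look for such entries is to search the log for the word "exception". This is a minimal sketch: check_log is a hypothetical helper name, and matching on the word "exception" is an assumption about how failures show up, not a complete error check.

```shell
# Hypothetical helper: report whether a log file mentions a Java exception.
# Prints matching lines with their line numbers and returns 1 if any exist.
check_log() {
  if grep -n -i 'exception' "$1"; then
    echo "exceptions found in $1"
    return 1
  fi
  echo "no exceptions in $1"
}

# Usage, once the crawl has been started with nohup:
#   check_log nohup.out
#   check_log logs/hadoop.log
```

While the crawl is still running, `tail -f nohup.out` follows the output as it is written.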

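          For the test, first create the urls folder, then create url.txt inside it

          and write http://blog.tianya.cn/ into it, which means the Tianya blog site will be crawled.

          The name url.txt is not fixed; you can use any other name. Every file in urls is read as a list of URLs.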
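
The seed setup described above can be sketched in a few commands, run from the local folder. The folder and file names are the ones used in this post; nothing here is specific to Nutch.

```shell
# Create the seed folder and a seed file (the file name is arbitrary).
mkdir -p urls
printf '%s\n' 'http://blog.tianya.cn/' > urls/url.txt

# Show what will be read: every file in urls/ is treated as a URL list.
cat urls/*
```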

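          You can drop -topN, which crawls every URL in the first two levels. I tried it and some pages took a very long time, so it is better to keep it.

          

          Run example 2. After a short while the crawl finishes. Below is the nohup.out log from my crawl; it shows no exceptions: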

solrUrl is not set, indexing will be skipped...
crawl started in: data
rootUrlDir = urls
threads = 50
depth = 2
solrUrl=null
topN = 2
Injector: starting at 2014-07-13 20:37:26
Injector: crawlDb: data/crawldb
Injector: urlDir: urls
Injector: Converting injected urls to crawl db entries.
Injector: total number of urls rejected by filters: 1
Injector: total number of urls injected after normalization and filtering: 2
Injector: Merging injected urls into crawl db.
Injector: finished at 2014-07-13 20:37:57, elapsed: 00:00:30
Generator: starting at 2014-07-13 20:37:57
Generator: Selecting best-scoring urls due for fetch.
Generator: filtering: true
Generator: normalizing: true
Generator: topN: 2
Generator: jobtracker is 'local', generating exactly one partition.
Generator: Partitioning selected urls for politeness.
Generator: segment: data/segments/20140713203805
Generator: finished at 2014-07-13 20:38:13, elapsed: 00:00:15
Fetcher: Your 'http.agent.name' value should be listed first in 'http.robots.agents' property.
Fetcher: starting at 2014-07-13 20:38:13
Fetcher: segment: data/segments/20140713203805
Using queue mode : byHost
Fetcher: threads: 50
Fetcher: time-out divisor: 2
QueueFeeder finished: total 1 records + hit by time limit :0
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
fetching http://blog.tianya.cn/
Using queue mode : byHost
-finishing thread FetcherThread, activeThreads=2
-finishing thread FetcherThread, activeThreads=1
Using queue mode : byHost
-finishing thread FetcherThread, activeThreads=1
Using queue mode : byHost
Using queue mode : byHost
-finishing thread FetcherThread, activeThreads=1
-finishing thread FetcherThread, activeThreads=1
Using queue mode : byHost
-finishing thread FetcherThread, activeThreads=1
Using queue mode : byHost
-finishing thread FetcherThread, activeThreads=1
Using queue mode : byHost
-finishing thread FetcherThread, activeThreads=1
Using queue mode : byHost
Using queue mode : byHost
-finishing thread FetcherThread, activeThreads=1
-finishing thread FetcherThread, activeThreads=1
-finishing thread FetcherThread, activeThreads=1
Using queue mode : byHost
Using queue mode : byHost
-finishing thread FetcherThread, activeThreads=1
-finishing thread FetcherThread, activeThreads=1
Using queue mode : byHost
Using queue mode : byHost
-finishing thread FetcherThread, activeThreads=1
Using queue mode : byHost
-finishing thread FetcherThread, activeThreads=1
Using queue mode : byHost
-finishing thread FetcherThread, activeThreads=1
Using queue mode : byHost
-finishing thread FetcherThread, activeThreads=1
Using queue mode : byHost
-finishing thread FetcherThread, activeThreads=1
Using queue mode : byHost
-finishing thread FetcherThread, activeThreads=1
Using queue mode : byHost
-finishing thread FetcherThread, activeThreads=1
-finishing thread FetcherThread, activeThreads=1
Using queue mode : byHost
Using queue mode : byHost
-finishing thread FetcherThread, activeThreads=1
Using queue mode : byHost
-finishing thread FetcherThread, activeThreads=1
Using queue mode : byHost
Using queue mode : byHost
-finishing thread FetcherThread, activeThreads=1
-finishing thread FetcherThread, activeThreads=1
Using queue mode : byHost
Using queue mode : byHost
-finishing thread FetcherThread, activeThreads=1
-finishing thread FetcherThread, activeThreads=1
Using queue mode : byHost
Using queue mode : byHost
-finishing thread FetcherThread, activeThreads=1
-finishing thread FetcherThread, activeThreads=1
Using queue mode : byHost
Using queue mode : byHost
-finishing thread FetcherThread, activeThreads=1
-finishing thread FetcherThread, activeThreads=1
Using queue mode : byHost
Using queue mode : byHost
-finishing thread FetcherThread, activeThreads=1
Using queue mode : byHost
-finishing thread FetcherThread, activeThreads=1
-finishing thread FetcherThread, activeThreads=1
Using queue mode : byHost
Using queue mode : byHost
-finishing thread FetcherThread, activeThreads=1
-finishing thread FetcherThread, activeThreads=1
Using queue mode : byHost
Using queue mode : byHost
-finishing thread FetcherThread, activeThreads=1
Using queue mode : byHost
-finishing thread FetcherThread, activeThreads=1
-finishing thread FetcherThread, activeThreads=1
Using queue mode : byHost
-finishing thread FetcherThread, activeThreads=1
Using queue mode : byHost
-finishing thread FetcherThread, activeThreads=1
Using queue mode : byHost
Using queue mode : byHost
-finishing thread FetcherThread, activeThreads=1
-finishing thread FetcherThread, activeThreads=1
Using queue mode : byHost
-finishing thread FetcherThread, activeThreads=1
Using queue mode : byHost
-finishing thread FetcherThread, activeThreads=1
Using queue mode : byHost
-finishing thread FetcherThread, activeThreads=1
Using queue mode : byHost
-finishing thread FetcherThread, activeThreads=1
-finishing thread FetcherThread, activeThreads=1
Using queue mode : byHost
Fetcher: throughput threshold: -1
Fetcher: throughput threshold retries: 5
-finishing thread FetcherThread, activeThreads=1
-activeThreads=1, spinWaiting=0, fetchQueues.totalSize=0
-activeThreads=1, spinWaiting=0, fetchQueues.totalSize=0
-activeThreads=1, spinWaiting=0, fetchQueues.totalSize=0
-activeThreads=1, spinWaiting=0, fetchQueues.totalSize=0
-activeThreads=1, spinWaiting=0, fetchQueues.totalSize=0
-activeThreads=1, spinWaiting=0, fetchQueues.totalSize=0
-activeThreads=1, spinWaiting=0, fetchQueues.totalSize=0
-finishing thread FetcherThread, activeThreads=0
-activeThreads=0, spinWaiting=0, fetchQueues.totalSize=0
-activeThreads=0
Fetcher: finished at 2014-07-13 20:38:26, elapsed: 00:00:13
ParseSegment: starting at 2014-07-13 20:38:26
ParseSegment: segment: data/segments/20140713203805
Parsed (64ms):http://blog.tianya.cn/
ParseSegment: finished at 2014-07-13 20:38:33, elapsed: 00:00:07
CrawlDb update: starting at 2014-07-13 20:38:33
CrawlDb update: db: data/crawldb
CrawlDb update: segments: [data/segments/20140713203805]
CrawlDb update: additions allowed: true
CrawlDb update: URL normalizing: true
CrawlDb update: URL filtering: true
CrawlDb update: 404 purging: false
CrawlDb update: Merging segment data into db.
CrawlDb update: finished at 2014-07-13 20:38:47, elapsed: 00:00:13
Generator: starting at 2014-07-13 20:38:47
Generator: Selecting best-scoring urls due for fetch.
Generator: filtering: true
Generator: normalizing: true
Generator: topN: 2
Generator: jobtracker is 'local', generating exactly one partition.
Generator: Partitioning selected urls for politeness.
Generator: segment: data/segments/20140713203855
Generator: finished at 2014-07-13 20:39:02, elapsed: 00:00:15
Fetcher: Your 'http.agent.name' value should be listed first in 'http.robots.agents' property.
Fetcher: starting at 2014-07-13 20:39:02
Fetcher: segment: data/segments/20140713203855
Using queue mode : byHost
Fetcher: threads: 50
Fetcher: time-out divisor: 2
QueueFeeder finished: total 2 records + hit by time limit :0
Using queue mode : byHost
Using queue mode : byHost
fetching http://blog.tianya.cn/blog/culture
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
Fetcher: throughput threshold: -1
Fetcher: throughput threshold retries: 5
-activeThreads=50, spinWaiting=49, fetchQueues.totalSize=1
* queue: http://blog.tianya.cn
  maxThreads    = 1
  inProgress    = 1
  crawlDelay    = 5000
  minCrawlDelay = 0
  nextFetchTime = 1405255142707
  now           = 1405255143834
  0. http://blog.tianya.cn/blog/daren
-activeThreads=50, spinWaiting=49, fetchQueues.totalSize=1
* queue: http://blog.tianya.cn
  maxThreads    = 1
  inProgress    = 1
  crawlDelay    = 5000
  minCrawlDelay = 0
  nextFetchTime = 1405255142707
  now           = 1405255144838
  0. http://blog.tianya.cn/blog/daren
-activeThreads=50, spinWaiting=49, fetchQueues.totalSize=1
* queue: http://blog.tianya.cn
  maxThreads    = 1
  inProgress    = 1
  crawlDelay    = 5000
  minCrawlDelay = 0
  nextFetchTime = 1405255142707
  now           = 1405255145841
  0. http://blog.tianya.cn/blog/daren
-activeThreads=50, spinWaiting=50, fetchQueues.totalSize=1
* queue: http://blog.tianya.cn
  maxThreads    = 1
  inProgress    = 0
  crawlDelay    = 5000
  minCrawlDelay = 0
  nextFetchTime = 1405255151041
  now           = 1405255146844
  0. http://blog.tianya.cn/blog/daren
-activeThreads=50, spinWaiting=50, fetchQueues.totalSize=1
* queue: http://blog.tianya.cn
  maxThreads    = 1
  inProgress    = 0
  crawlDelay    = 5000
  minCrawlDelay = 0
  nextFetchTime = 1405255151041
  now           = 1405255147847
  0. http://blog.tianya.cn/blog/daren
-activeThreads=50, spinWaiting=50, fetchQueues.totalSize=1
* queue: http://blog.tianya.cn
  maxThreads    = 1
  inProgress    = 0
  crawlDelay    = 5000
  minCrawlDelay = 0
  nextFetchTime = 1405255151041
  now           = 1405255148852
  0. http://blog.tianya.cn/blog/daren
-activeThreads=50, spinWaiting=50, fetchQueues.totalSize=1
* queue: http://blog.tianya.cn
  maxThreads    = 1
  inProgress    = 0
  crawlDelay    = 5000
  minCrawlDelay = 0
  nextFetchTime = 1405255151041
  now           = 1405255149855
  0. http://blog.tianya.cn/blog/daren
-activeThreads=50, spinWaiting=50, fetchQueues.totalSize=1
* queue: http://blog.tianya.cn
  maxThreads    = 1
  inProgress    = 0
  crawlDelay    = 5000
  minCrawlDelay = 0
  nextFetchTime = 1405255151041
  now           = 1405255150858
  0. http://blog.tianya.cn/blog/daren
fetching http://blog.tianya.cn/blog/daren
-finishing thread FetcherThread, activeThreads=49
-finishing thread FetcherThread, activeThreads=48
-finishing thread FetcherThread, activeThreads=46
-finishing thread FetcherThread, activeThreads=47
-finishing thread FetcherThread, activeThreads=45
-finishing thread FetcherThread, activeThreads=44
-finishing thread FetcherThread, activeThreads=43
-finishing thread FetcherThread, activeThreads=41
-finishing thread FetcherThread, activeThreads=42
-finishing thread FetcherThread, activeThreads=40
-finishing thread FetcherThread, activeThreads=39
-finishing thread FetcherThread, activeThreads=38
-finishing thread FetcherThread, activeThreads=37
-finishing thread FetcherThread, activeThreads=36
-finishing thread FetcherThread, activeThreads=35
-finishing thread FetcherThread, activeThreads=34
-finishing thread FetcherThread, activeThreads=33
-finishing thread FetcherThread, activeThreads=32
-finishing thread FetcherThread, activeThreads=31
-finishing thread FetcherThread, activeThreads=30
-finishing thread FetcherThread, activeThreads=29
-finishing thread FetcherThread, activeThreads=28
-finishing thread FetcherThread, activeThreads=27
-finishing thread FetcherThread, activeThreads=26
-finishing thread FetcherThread, activeThreads=25
-finishing thread FetcherThread, activeThreads=24
-finishing thread FetcherThread, activeThreads=23
-finishing thread FetcherThread, activeThreads=22
-finishing thread FetcherThread, activeThreads=21
-finishing thread FetcherThread, activeThreads=20
-finishing thread FetcherThread, activeThreads=19
-finishing thread FetcherThread, activeThreads=18
-finishing thread FetcherThread, activeThreads=17
-finishing thread FetcherThread, activeThreads=16
-finishing thread FetcherThread, activeThreads=15
-finishing thread FetcherThread, activeThreads=14
-finishing thread FetcherThread, activeThreads=13
-finishing thread FetcherThread, activeThreads=12
-finishing thread FetcherThread, activeThreads=11
-finishing thread FetcherThread, activeThreads=10
-finishing thread FetcherThread, activeThreads=9
-finishing thread FetcherThread, activeThreads=8
-finishing thread FetcherThread, activeThreads=7
-finishing thread FetcherThread, activeThreads=6
-finishing thread FetcherThread, activeThreads=5
-finishing thread FetcherThread, activeThreads=4
-finishing thread FetcherThread, activeThreads=3
-finishing thread FetcherThread, activeThreads=2
-finishing thread FetcherThread, activeThreads=1
-finishing thread FetcherThread, activeThreads=0
-activeThreads=0, spinWaiting=0, fetchQueues.totalSize=0
-activeThreads=0
Fetcher: finished at 2014-07-13 20:39:18, elapsed: 00:00:16
ParseSegment: starting at 2014-07-13 20:39:18
ParseSegment: segment: data/segments/20140713203855
Parsed (13ms):http://blog.tianya.cn/blog/culture
Parsed (3ms):http://blog.tianya.cn/blog/daren
ParseSegment: finished at 2014-07-13 20:39:25, elapsed: 00:00:07
CrawlDb update: starting at 2014-07-13 20:39:25
CrawlDb update: db: data/crawldb
CrawlDb update: segments: [data/segments/20140713203855]
CrawlDb update: additions allowed: true
CrawlDb update: URL normalizing: true
CrawlDb update: URL filtering: true
CrawlDb update: 404 purging: false
CrawlDb update: Merging segment data into db.
CrawlDb update: finished at 2014-07-13 20:39:38, elapsed: 00:00:13
LinkDb: starting at 2014-07-13 20:39:38
LinkDb: linkdb: data/linkdb
LinkDb: URL normalize: true
LinkDb: URL filter: true
LinkDb: internal links will be ignored.
LinkDb: adding segment: file:/home/lan/nutch/local/data/segments/20140713203805
LinkDb: adding segment: file:/home/lan/nutch/local/data/segments/20140713203855
LinkDb: finished at 2014-07-13 20:39:48, elapsed: 00:00:10
crawl finished: data
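
A log this long is easier to read once it is filtered. The sketch below pulls out the fetched URLs and the per-phase timings. fetched_urls and phase_times are hypothetical helper names, and the patterns assume the log format shown above ("fetching <url>" at the start of a line, "finished at" on the phase summary lines).

```shell
# Hypothetical helper: list the URLs a crawl log reports as fetched.
# Assumption: such lines start with "fetching " followed by the URL.
fetched_urls() {
  grep '^fetching ' "$1" | awk '{print $2}'
}

# Hypothetical helper: show when each phase finished and how long it took.
phase_times() {
  grep 'finished at' "$1"
}

# Usage:
#   fetched_urls nohup.out
#   phase_times nohup.out
```

For the run above, fetched_urls should list the seed page plus the two pages chosen in the second round, which matches -depth 2 with -topN 2.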

 The log without -topN:

solrUrl is not set, indexing will be skipped...
crawl started in: data
rootUrlDir = urls
threads = 50
depth = 2
solrUrl=null
Injector: starting at 2014-07-14 20:52:04
Injector: crawlDb: data/crawldb
Injector: urlDir: urls
Injector: Converting injected urls to crawl db entries.
Injector: total number of urls rejected by filters: 1
Injector: total number of urls injected after normalization and filtering: 2
Injector: Merging injected urls into crawl db.
Injector: finished at 2014-07-14 20:52:23, elapsed: 00:00:19
Generator: starting at 2014-07-14 20:52:23
Generator: Selecting best-scoring urls due for fetch.
Generator: filtering: true
Generator: normalizing: true
Generator: jobtracker is 'local', generating exactly one partition.
Generator: Partitioning selected urls for politeness.
Generator: segment: data/segments/20140714205231
Generator: finished at 2014-07-14 20:52:39, elapsed: 00:00:15
Fetcher: Your 'http.agent.name' value should be listed first in 'http.robots.agents' property.
Fetcher: starting at 2014-07-14 20:52:39
Fetcher: segment: data/segments/20140714205231
Using queue mode : byHost
Fetcher: threads: 50
Fetcher: time-out divisor: 2
QueueFeeder finished: total 1 records + hit by time limit :0
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
fetching http://blog.tianya.cn/
Using queue mode : byHost
-finishing thread FetcherThread, activeThreads=2
-finishing thread FetcherThread, activeThreads=1
Using queue mode : byHost
-finishing thread FetcherThread, activeThreads=1
Using queue mode : byHost
-finishing thread FetcherThread, activeThreads=1
Using queue mode : byHost
-finishing thread FetcherThread, activeThreads=1
Using queue mode : byHost
-finishing thread FetcherThread, activeThreads=1
Using queue mode : byHost
-finishing thread FetcherThread, activeThreads=1
Using queue mode : byHost
-finishing thread FetcherThread, activeThreads=1
Using queue mode : byHost
-finishing thread FetcherThread, activeThreads=1
Using queue mode : byHost
-finishing thread FetcherThread, activeThreads=1
Using queue mode : byHost
Using queue mode : byHost
-finishing thread FetcherThread, activeThreads=1
Using queue mode : byHost
-finishing thread FetcherThread, activeThreads=1
-finishing thread FetcherThread, activeThreads=1
Using queue mode : byHost
-finishing thread FetcherThread, activeThreads=1
Using queue mode : byHost
-finishing thread FetcherThread, activeThreads=1
Using queue mode : byHost
Using queue mode : byHost
-finishing thread FetcherThread, activeThreads=1
Using queue mode : byHost
-finishing thread FetcherThread, activeThreads=1
-finishing thread FetcherThread, activeThreads=1
Using queue mode : byHost
-finishing thread FetcherThread, activeThreads=1
Using queue mode : byHost
Using queue mode : byHost
-finishing thread FetcherThread, activeThreads=1
Using queue mode : byHost
-finishing thread FetcherThread, activeThreads=1
-finishing thread FetcherThread, activeThreads=1
Using queue mode : byHost
-finishing thread FetcherThread, activeThreads=1
Using queue mode : byHost
Using queue mode : byHost
-finishing thread FetcherThread, activeThreads=1
-finishing thread FetcherThread, activeThreads=1
Using queue mode : byHost
-finishing thread FetcherThread, activeThreads=1
Using queue mode : byHost
Using queue mode : byHost
-finishing thread FetcherThread, activeThreads=1
-finishing thread FetcherThread, activeThreads=1
Using queue mode : byHost
Using queue mode : byHost
-finishing thread FetcherThread, activeThreads=1
-finishing thread FetcherThread, activeThreads=1
Using queue mode : byHost
Using queue mode : byHost
-finishing thread FetcherThread, activeThreads=1
Using queue mode : byHost
-finishing thread FetcherThread, activeThreads=1
-finishing thread FetcherThread, activeThreads=1
Using queue mode : byHost
Using queue mode : byHost
-finishing thread FetcherThread, activeThreads=1
-finishing thread FetcherThread, activeThreads=1
Using queue mode : byHost
-finishing thread FetcherThread, activeThreads=1
Using queue mode : byHost
Using queue mode : byHost
-finishing thread FetcherThread, activeThreads=1
Using queue mode : byHost
-finishing thread FetcherThread, activeThreads=1
Using queue mode : byHost
-finishing thread FetcherThread, activeThreads=1
Using queue mode : byHost
-finishing thread FetcherThread, activeThreads=1
Using queue mode : byHost
-finishing thread FetcherThread, activeThreads=1
Using queue mode : byHost
-finishing thread FetcherThread, activeThreads=1
-finishing thread FetcherThread, activeThreads=1
Using queue mode : byHost
-finishing thread FetcherThread, activeThreads=1
Using queue mode : byHost
-finishing thread FetcherThread, activeThreads=1
Using queue mode : byHost
-finishing thread FetcherThread, activeThreads=1
Using queue mode : byHost
-finishing thread FetcherThread, activeThreads=1
Using queue mode : byHost
-finishing thread FetcherThread, activeThreads=1
Fetcher: throughput threshold: -1
Fetcher: throughput threshold retries: 5
-finishing thread FetcherThread, activeThreads=1
-activeThreads=1, spinWaiting=0, fetchQueues.totalSize=0
-activeThreads=1, spinWaiting=0, fetchQueues.totalSize=0
-activeThreads=1, spinWaiting=0, fetchQueues.totalSize=0
-activeThreads=1, spinWaiting=0, fetchQueues.totalSize=0
-activeThreads=1, spinWaiting=0, fetchQueues.totalSize=0
-activeThreads=1, spinWaiting=0, fetchQueues.totalSize=0
-activeThreads=1, spinWaiting=0, fetchQueues.totalSize=0
-activeThreads=1, spinWaiting=0, fetchQueues.totalSize=0
-activeThreads=1, spinWaiting=0, fetchQueues.totalSize=0
-activeThreads=1, spinWaiting=0, fetchQueues.totalSize=0
-activeThreads=1, spinWaiting=0, fetchQueues.totalSize=0
-activeThreads=1, spinWaiting=0, fetchQueues.totalSize=0
-activeThreads=1, spinWaiting=0, fetchQueues.totalSize=0
-activeThreads=1, spinWaiting=0, fetchQueues.totalSize=0
-finishing thread FetcherThread, activeThreads=0
-activeThreads=0, spinWaiting=0, fetchQueues.totalSize=0
-activeThreads=0
Fetcher: finished at 2014-07-14 20:53:01, elapsed: 00:00:22
ParseSegment: starting at 2014-07-14 20:53:01
ParseSegment: segment: data/segments/20140714205231
Parsed (47ms):http://blog.tianya.cn/
ParseSegment: finished at 2014-07-14 20:53:08, elapsed: 00:00:07
CrawlDb update: starting at 2014-07-14 20:53:08
CrawlDb update: db: data/crawldb
CrawlDb update: segments: [data/segments/20140714205231]
CrawlDb update: additions allowed: true
CrawlDb update: URL normalizing: true
CrawlDb update: URL filtering: true
CrawlDb update: 404 purging: false
CrawlDb update: Merging segment data into db.
CrawlDb update: finished at 2014-07-14 20:53:22, elapsed: 00:00:13
Generator: starting at 2014-07-14 20:53:22
Generator: Selecting best-scoring urls due for fetch.
Generator: filtering: true
Generator: normalizing: true
Generator: jobtracker is 'local', generating exactly one partition.
Generator: Partitioning selected urls for politeness.
Generator: segment: data/segments/20140714205330
Generator: finished at 2014-07-14 20:53:37, elapsed: 00:00:15
Fetcher: Your 'http.agent.name' value should be listed first in 'http.robots.agents' property.
Fetcher: starting at 2014-07-14 20:53:37
Fetcher: segment: data/segments/20140714205330
Using queue mode : byHost
Fetcher: threads: 50
Fetcher: time-out divisor: 2
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
fetching http://www.tianya.cn/mobile
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
fetching http://blog.tianya.cn/post-5010184-62889385-1.shtml
QueueFeeder finished: total 100 records + hit by time limit :0
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
Using queue mode : byHost
Fetcher: throughput threshold: -1
Fetcher: throughput threshold retries: 5
-activeThreads=50, spinWaiting=48, fetchQueues.totalSize=98
-activeThreads=50, spinWaiting=48, fetchQueues.totalSize=98
-activeThreads=50, spinWaiting=48, fetchQueues.totalSize=98
-activeThreads=50, spinWaiting=48, fetchQueues.totalSize=98
-activeThreads=50, spinWaiting=48, fetchQueues.totalSize=98
-activeThreads=50, spinWaiting=48, fetchQueues.totalSize=98
-activeThreads=50, spinWaiting=48, fetchQueues.totalSize=98
-activeThreads=50, spinWaiting=48, fetchQueues.totalSize=98
-activeThreads=50, spinWaiting=48, fetchQueues.totalSize=98
-activeThreads=50, spinWaiting=48, fetchQueues.totalSize=98
-activeThreads=50, spinWaiting=48, fetchQueues.totalSize=98
-activeThreads=50, spinWaiting=49, fetchQueues.totalSize=98
-activeThreads=50, spinWaiting=49, fetchQueues.totalSize=98
-activeThreads=50, spinWaiting=49, fetchQueues.totalSize=98
-activeThreads=50, spinWaiting=49, fetchQueues.totalSize=98
-activeThreads=50, spinWaiting=49, fetchQueues.totalSize=98
fetching http://blog.tianya.cn/blog/culture
-activeThreads=50, spinWaiting=49, fetchQueues.totalSize=97
-activeThreads=50, spinWaiting=50, fetchQueues.totalSize=97
-activeThreads=50, spinWaiting=50, fetchQueues.totalSize=97
-activeThreads=50, spinWaiting=50, fetchQueues.totalSize=97
-activeThreads=50, spinWaiting=50, fetchQueues.totalSize=97
-activeThreads=50, spinWaiting=50, fetchQueues.totalSize=97
fetching http://blog.tianya.cn/post-4487705-62917227-1.shtml
-activeThreads=50, spinWaiting=50, fetchQueues.totalSize=96
-activeThreads=50, spinWaiting=50, fetchQueues.totalSize=96
-activeThreads=50, spinWaiting=50, fetchQueues.totalSize=96
-activeThreads=50, spinWaiting=50, fetchQueues.totalSize=96
-activeThreads=50, spinWaiting=50, fetchQueues.totalSize=96
fetching http://blog.tianya.cn/post-1119083-62403495-1.shtml
-activeThreads=50, spinWaiting=49, fetchQueues.totalSize=95
-activeThreads=50, spinWaiting=49, fetchQueues.totalSize=95
-activeThreads=50, spinWaiting=49, fetchQueues.totalSize=95
-activeThreads=50, spinWaiting=50, fetchQueues.totalSize=95
-activeThreads=50, spinWaiting=50, fetchQueues.totalSize=95
-activeThreads=50, spinWaiting=50, fetchQueues.totalSize=95
-activeThreads=50, spinWaiting=50, fetchQueues.totalSize=95
-activeThreads=50, spinWaiting=50, fetchQueues.totalSize=95
fetching http://blog.tianya.cn/blog/ent
-activeThreads=50, spinWaiting=49, fetchQueues.totalSize=94
-activeThreads=50, spinWaiting=50, fetchQueues.totalSize=94
-activeThreads=50, spinWaiting=50, fetchQueues.totalSize=94
-activeThreads=50, spinWaiting=50, fetchQueues.totalSize=94
-activeThreads=50, spinWaiting=50, fetchQueues.totalSize=94
-activeThreads=50, spinWaiting=50, fetchQueues.totalSize=94
fetching http://blog.tianya.cn/post-4598537-62971598-1.shtml
-activeThreads=50, spinWaiting=49, fetchQueues.totalSize=93
-activeThreads=50, spinWaiting=50, fetchQueues.totalSize=93
-activeThreads=50, spinWaiting=50, fetchQueues.totalSize=93
-activeThreads=50, spinWaiting=50, fetchQueues.totalSize=93
-activeThreads=50, spinWaiting=50, fetchQueues.totalSize=93
-activeThreads=50, spinWaiting=50, fetchQueues.totalSize=93
fetching http://blog.tianya.cn/post-5010184-62834903-1.shtml
-activeThreads=50, spinWaiting=50, fetchQueues.totalSize=92
-activeThreads=50, spinWaiting=50, fetchQueues.totalSize=92
-activeThreads=50, spinWaiting=50, fetchQueues.totalSize=92
-activeThreads=50, spinWaiting=50, fetchQueues.totalSize=92
-activeThreads=50, spinWaiting=50, fetchQueues.totalSize=92
fetching http://blog.tianya.cn/post-4877164-61406732-1.shtml
-activeThreads=50, spinWaiting=49, fetchQueues.totalSize=91
-activeThreads=50, spinWaiting=50, fetchQueues.totalSize=91
-activeThreads=50, spinWaiting=50, fetchQueues.totalSize=91
-activeThreads=50, spinWaiting=50, fetchQueues.totalSize=91
-activeThreads=50, spinWaiting=50, fetchQueues.totalSize=91
-activeThreads=50, spinWaiting=50, fetchQueues.totalSize=91
fetching http://blog.tianya.cn/post-78180-59109533-1.shtml
-activeThreads=50, spinWaiting=49, fetchQueues.totalSize=90
-activeThreads=50, spinWaiting=50, fetchQueues.totalSize=90
-activeThreads=50, spinWaiting=50, fetchQueues.totalSize=90
-activeThreads=50, spinWaiting=50, fetchQueues.totalSize=90
 302 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=90
 303 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=90
 304 fetching http://blog.tianya.cn/post-4362114-63792588-1.shtml
 305 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=89
 306 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=89
 307 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=89
 308 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=89
 309 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=89
 310 fetching http://blog.tianya.cn/post-3961685-62977022-1.shtml
 311 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=88
 312 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=88
 313 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=88
 314 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=88
 315 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=88
 316 fetching http://blog.tianya.cn/post-5010184-62890806-1.shtml
 317 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=87
 318 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=87
 319 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=87
 320 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=87
 321 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=87
 322 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=87
 323 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=87
 324 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=87
 325 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=87
 326 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=87
 327 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=87
 328 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=87
 329 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=87
 330 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=87
 331 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=87
 332 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=87
 333 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=87
 334 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=87
 335 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=87
 336 fetching http://blog.tianya.cn/blog/mingbo
 337 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=86
 338 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=86
 339 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=86
 340 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=86
 341 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=86
 342 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=86
 343 fetching http://blog.tianya.cn/post-959477-62971507-1.shtml
 344 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=85
 345 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=85
 346 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=85
 347 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=85
 348 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=85
 349 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=85
 350 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=85
 351 fetching http://blog.tianya.cn/post-4562315-62807399-1.shtml
 352 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=84
 353 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=84
 354 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=84
 355 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=84
 356 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=84
 357 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=84
 358 fetching http://blog.tianya.cn/post-3941055-62934113-1.shtml
 359 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=83
 360 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=83
 361 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=83
 362 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=83
 363 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=83
 364 fetching http://blog.tianya.cn/blog/daren
 365 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=82
 366 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=82
 367 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=82
 368 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=82
 369 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=82
 370 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=82
 371 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=82
 372 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=82
 373 fetching http://blog.tianya.cn/post-1196211-63799917-1.shtml
 374 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=81
 375 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=81
 376 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=81
 377 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=81
 378 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=81
 379 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=81
 380 fetching http://blog.tianya.cn/post-196238-62376389-1.shtml
 381 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=80
 382 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=80
 383 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=80
 384 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=80
 385 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=80
 386 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=80
 387 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=80
 388 fetching http://blog.tianya.cn/post-4700528-62898660-1.shtml
 389 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=79
 390 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=79
 391 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=79
 392 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=79
 393 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=79
 394 fetching http://blog.tianya.cn/post-1119083-62958234-1.shtml
 395 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=78
 396 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=78
 397 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=78
 398 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=78
 399 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=78
 400 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=78
 401 fetching http://blog.tianya.cn/post-1671874-62898829-1.shtml
 402 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=77
 403 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=77
 404 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=77
 405 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=77
 406 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=77
 407 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=77
 408 fetching http://blog.tianya.cn/post-5010184-62313586-1.shtml
 409 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=76
 410 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=76
 411 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=76
 412 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=76
 413 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=76
 414 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=76
 415 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=76
 416 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=76
 417 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=76
 418 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=76
 419 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=76
 420 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=76
 421 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=76
 422 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=76
 423 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=76
 424 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=76
 425 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=76
 426 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=76
 427 fetching http://blog.tianya.cn/post-4598537-62379563-1.shtml
 428 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=75
 429 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=75
 430 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=75
 431 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=75
 432 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=75
 433 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=75
 434 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=75
 435 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=75
 436 fetching http://blog.tianya.cn/post-236764-59417277-1.shtml
 437 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=74
 438 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=74
 439 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=74
 440 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=74
 441 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=74
 442 fetching http://blog.tianya.cn/post-4360774-62845782-1.shtml
 443 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=73
 444 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=73
 445 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=73
 446 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=73
 447 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=73
 448 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=73
 449 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=73
 450 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=73
 451 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=73
 452 fetching http://blog.tianya.cn/post-196238-61158698-1.shtml
 453 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=72
 454 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=72
 455 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=72
 456 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=72
 457 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=72
 458 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=72
 459 fetching http://blog.tianya.cn/post-3340761-62357537-1.shtml
 460 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=71
 461 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=71
 462 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=71
 463 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=71
 464 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=71
 465 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=71
 466 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=71
 467 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=71
 468 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=71
 469 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=71
 470 fetching http://blog.tianya.cn/post-4562315-62367801-1.shtml
 471 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=70
 472 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=70
 473 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=70
 474 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=70
 475 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=70
 476 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=70
 477 fetching http://blog.tianya.cn/post-38484-61144592-1.shtml
 478 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=69
 479 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=69
 480 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=69
 481 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=69
 482 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=69
 483 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=69
 484 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=69
 485 fetching http://blog.tianya.cn/post-4487705-63000074-1.shtml
 486 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=68
 487 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=68
 488 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=68
 489 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=68
 490 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=68
 491 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=68
 492 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=68
 493 fetching http://blog.tianya.cn/post-3941055-62972581-1.shtml
 494 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=67
 495 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=67
 496 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=67
 497 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=67
 498 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=67
 499 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=67
 500 fetching http://blog.tianya.cn/post-2066284-62926321-1.shtml
 501 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=66
 502 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=66
 503 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=66
 504 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=66
 505 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=66
 506 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=66
 507 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=66
 508 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=66
 509 fetching http://blog.tianya.cn/post-4608093-62651701-1.shtml
 510 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=65
 511 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=65
 512 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=65
 513 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=65
 514 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=65
 515 fetching http://blog.tianya.cn/post-236764-60248116-1.shtml
 516 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=64
 517 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=64
 518 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=64
 519 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=64
 520 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=64
 521 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=64
 522 fetching http://blog.tianya.cn/post-5010184-62718271-1.shtml
 523 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=63
 524 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=63
 525 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=63
 526 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=63
 527 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=63
 528 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=63
 529 fetching http://blog.tianya.cn/post-234213-62960519-1.shtml
 530 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=62
 531 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=62
 532 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=62
 533 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=62
 534 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=62
 535 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=62
 536 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=62
 537 fetching http://blog.tianya.cn/post-4600300-62374308-1.shtml
 538 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=61
 539 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=61
 540 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=61
 541 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=61
 542 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=61
 543 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=61
 544 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=61
 545 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=61
 546 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=61
 547 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=61
 548 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=61
 549 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=61
 550 fetching http://blog.tianya.cn/post-3739914-62875218-1.shtml
 551 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=60
 552 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=60
 553 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=60
 554 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=60
 555 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=60
 556 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=60
 557 fetching http://blog.tianya.cn/post-1119083-62979540-1.shtml
 558 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=59
 559 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=59
 560 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=59
 561 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=59
 562 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=59
 563 fetching http://blog.tianya.cn/post-3773157-62890053-1.shtml
 564 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=58
 565 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=58
 566 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=58
 567 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=58
 568 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=58
 569 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=58
 570 fetching http://blog.tianya.cn/post-4562315-62899385-1.shtml
 571 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=57
 572 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=57
 573 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=57
 574 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=57
 575 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=57
 576 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=57
 577 fetching http://blog.tianya.cn/post-2513619-62970447-1.shtml
 578 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=56
 579 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=56
 580 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=56
 581 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=56
 582 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=56
 583 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=56
 584 fetching http://blog.tianya.cn/post-4482611-62820517-1.shtml
 585 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=55
 586 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=55
 587 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=55
 588 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=55
 589 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=55
 590 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=55
 591 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=55
 592 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=55
 593 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=55
 594 fetching http://blog.tianya.cn/post-236764-58766442-1.shtml
 595 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=54
 596 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=54
 597 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=54
 598 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=54
 599 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=54
 600 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=54
 601 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=54
 602 fetching http://blog.tianya.cn/post-351212-59432160-1.shtml
 603 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=53
 604 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=53
 605 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=53
 606 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=53
 607 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=53
 608 fetching http://blog.tianya.cn/post-174091-62981677-1.shtml
 609 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=52
 610 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=52
 611 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=52
 612 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=52
 613 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=52
 614 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=52
 615 fetching http://blog.tianya.cn/post-78180-62903890-1.shtml
 616 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=51
 617 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=51
 618 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=51
 619 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=51
 620 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=51
 621 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=51
 622 fetching http://blog.tianya.cn/blog/history
 623 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=50
 624 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=50
 625 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=50
 626 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=50
 627 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=50
 628 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=50
 629 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=50
 630 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=50
 631 fetching http://blog.tianya.cn/post-1578250-62896383-1.shtml
 632 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=49
 633 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=49
 634 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=49
 635 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=49
 636 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=49
 637 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=49
 638 fetching http://blog.tianya.cn/post-196238-62190438-1.shtml
 639 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=48
 640 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=48
 641 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=48
 642 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=48
 643 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=48
 644 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=48
 645 fetching http://blog.tianya.cn/post-196238-61974722-1.shtml
 646 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=47
 647 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=47
 648 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=47
 649 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=47
 650 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=47
 651 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=47
 652 fetching http://blog.tianya.cn/post-4700528-62898663-1.shtml
 653 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=46
 654 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=46
 655 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=46
 656 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=46
 657 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=46
 658 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=46
 659 fetching http://blog.tianya.cn/post-5010184-62837336-1.shtml
 660 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=45
 661 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=45
 662 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=45
 663 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=45
 664 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=45
 665 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=45
 666 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=45
 667 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=45
 668 fetching http://blog.tianya.cn/blog/finance
 669 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=44
 670 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=44
 671 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=44
 672 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=44
 673 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=44
 674 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=44
 675 fetching http://blog.tianya.cn/post-145340-62426203-1.shtml
 676 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=43
 677 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=43
 678 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=43
 679 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=43
 680 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=43
 681 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=43
 682 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=43
 683 fetching http://blog.tianya.cn/post-1870300-63794004-1.shtml
 684 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=42
 685 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=42
 686 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=42
 687 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=42
 688 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=42
 689 fetching http://blog.tianya.cn/post-863996-62974859-1.shtml
 690 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=41
 691 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=41
 692 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=41
 693 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=41
 694 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=41
 695 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=41
 696 fetching http://blog.tianya.cn/post-3727390-62972109-1.shtml
 697 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=40
 698 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=40
 699 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=40
 700 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=40
 701 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=40
 702 fetching http://blog.tianya.cn/blog/emotion
 703 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=39
 704 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=39
 705 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=39
 706 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=39
 707 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=39
 708 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=39
 709 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=39
 710 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=39
 711 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=39
 712 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=39
 713 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=39
 714 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=39
 715 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=39
 716 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=39
 717 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=39
 718 fetching http://blog.tianya.cn/post-336487-63732130-1.shtml
 719 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=38
 720 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=38
 721 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=38
 722 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=38
 723 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=38
 724 fetching http://blog.tianya.cn/post-4025452-63785440-1.shtml
 725 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=37
 726 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=37
 727 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=37
 728 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=37
 729 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=37
 730 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=37
 731 fetching http://blog.tianya.cn/post-137239-63797690-1.shtml
 732 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=36
 733 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=36
 734 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=36
 735 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=36
 736 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=36
 737 fetching http://blog.tianya.cn/post-1838543-62970839-1.shtml
 738 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=35
 739 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=35
 740 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=35
 741 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=35
 742 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=35
 743 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=35
 744 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=35
 745 fetching http://blog.tianya.cn/blog/society
 746 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=34
 747 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=34
 748 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=34
 749 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=34
 750 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=34
 751 fetching http://blog.tianya.cn/post-542686-63799203-1.shtml
 752 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=33
 753 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=33
 754 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=33
 755 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=33
 756 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=33
 757 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=33
 758 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=33
 759 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=33
 760 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=33
 761 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=33
 762 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=33
 763 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=33
 764 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=33
 765 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=33
 766 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=33
 767 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=33
 768 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=33
 769 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=33
 770 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=33
 771 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=33
 772 fetching http://blog.tianya.cn/post-1438407-62987507-1.shtml
 773 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=32
 774 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=32
 775 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=32
 776 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=32
 777 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=32
 778 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=32
 779 fetching http://blog.tianya.cn/post-3773157-62390018-1.shtml
 780 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=31
 781 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=31
 782 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=31
 783 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=31
 784 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=31
 785 fetching http://blog.tianya.cn/post-78180-58859246-1.shtml
 786 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=30
 787 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=30
 788 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=30
 789 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=30
 790 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=30
 791 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=30
 792 fetching http://blog.tianya.cn/post-236764-62962675-1.shtml
 793 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=29
 794 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=29
 795 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=29
 796 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=29
 797 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=29
 798 fetching http://blog.tianya.cn/blog/life
 799 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=28
 800 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=28
 801 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=28
 802 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=28
 803 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=28
 804 fetching http://blog.tianya.cn/post-1883179-62390915-1.shtml
 805 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=27
 806 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=27
 807 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=27
 808 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=27
 809 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=27
 810 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=27
 811 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=27
 812 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=27
 813 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=27
 814 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=27
 815 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=27
 816 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=27
 817 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=27
 818 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=27
 819 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=27
 820 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=27
 821 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=27
 822 fetching http://blog.tianya.cn/post-4009947-62401775-1.shtml
 823 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=26
 824 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=26
 825 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=26
 826 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=26
 827 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=26
 828 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=26
 829 fetching http://blog.tianya.cn/post-4047683-63794167-1.shtml
 830 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=25
 831 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=25
 832 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=25
 833 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=25
 834 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=25
 835 fetching http://blog.tianya.cn/post-1755624-62987935-1.shtml
 836 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=24
 837 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=24
 838 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=24
 839 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=24
 840 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=24
 841 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=24
 842 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=24
 843 fetching http://blog.tianya.cn/post-5010184-62690266-1.shtml
 844 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=23
 845 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=23
 846 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=23
 847 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=23
 848 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=23
 849 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=23
 850 fetching http://blog.tianya.cn/post-4353581-62972558-1.shtml
 851 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=22
 852 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=22
 853 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=22
 854 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=22
 855 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=22
 856 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=22
 857 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=22
 858 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=22
 859 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=22
 860 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=22
 861 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=22
 862 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=22
 863 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=22
 864 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=22
 865 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=22
 866 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=22
 867 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=22
 868 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=22
 869 fetching http://blog.tianya.cn/blog/international
 870 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=21
 871 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=21
 872 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=21
 873 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=21
 874 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=21
 875 fetching http://blog.tianya.cn/post-196238-61768175-1.shtml
 876 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=20
 877 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=20
 878 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=20
 879 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=20
 880 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=20
 881 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=20
 882 fetching http://blog.tianya.cn/post-4877164-61415979-1.shtml
 883 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=19
 884 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=19
 885 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=19
 886 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=19
 887 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=19
 888 fetching http://blog.tianya.cn/post-544588-62883194-1.shtml
 889 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=18
 890 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=18
 891 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=18
 892 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=18
 893 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=18
 894 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=18
 895 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=18
 896 fetching http://blog.tianya.cn/post-4250142-62927024-1.shtml
 897 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=17
 898 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=17
 899 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=17
 900 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=17
 901 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=17
 902 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=17
 903 fetching http://blog.tianya.cn/post-78180-62980961-1.shtml
 904 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=16
 905 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=16
 906 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=16
 907 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=16
 908 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=16
 909 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=16
 910 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=16
 911 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=16
 912 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=16
 913 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=16
 914 fetching http://blog.tianya.cn/post-4353581-62972544-1.shtml
 915 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=15
 916 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=15
 917 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=15
 918 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=15
 919 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=15
 920 fetching http://blog.tianya.cn/blog/stock
 921 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=14
 922 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=14
 923 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=14
 924 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=14
 925 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=14
 926 fetching http://blog.tianya.cn/post-4482611-62391796-1.shtml
 927 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=13
 928 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=13
 929 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=13
 930 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=13
 931 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=13
 932 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=13
 933 fetching http://blog.tianya.cn/post-4482611-62900444-1.shtml
 934 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=12
 935 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=12
 936 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=12
 937 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=12
 938 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=12
 939 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=12
 940 fetching http://blog.tianya.cn/blog/sports
 941 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=11
 942 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=11
 943 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=11
 944 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=11
 945 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=11
 946 fetching http://blog.tianya.cn/post-1882702-63776337-1.shtml
 947 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=10
 948 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=10
 949 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=10
 950 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=10
 951 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=10
 952 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=10
 953 fetching http://blog.tianya.cn/post-3773157-62958131-1.shtml
 954 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=9
 955 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=9
 956 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=9
 957 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=9
 958 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=9
 959 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=9
 960 fetching http://blog.tianya.cn/post-4101233-62653750-1.shtml
 961 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=8
 962 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=8
 963 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=8
 964 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=8
 965 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=8
 966 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=8
 967 fetching http://blog.tianya.cn/blog/newPush
 968 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=7
 969 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=7
 970 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=7
 971 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=7
 972 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=7
 973 fetching http://blog.tianya.cn/post-2111189-62899907-1.shtml
 974 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=6
 975 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=6
 976 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=6
 977 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=6
 978 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=6
 979 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=6
 980 fetching http://blog.tianya.cn/blog/food
 981 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=5
 982 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=5
 983 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=5
 984 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=5
 985 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=5
 986 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=5
 987 fetching http://blog.tianya.cn/post-1515015-63779836-1.shtml
 988 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=4
 989 * queue: http://blog.tianya.cn
 990   maxThreads    = 1
 991   inProgress    = 1
 992   crawlDelay    = 5000
 993   minCrawlDelay = 0
 994   nextFetchTime = 1405343081023
 995   now           = 1405343081793
 996   0. http://blog.tianya.cn/post-142905-62961160-1.shtml
 997   1. http://blog.tianya.cn/blog/travel
 998   2. http://blog.tianya.cn/post-4598537-62971461-1.shtml
 999   3. http://blog.tianya.cn/post-4598537-62971498-1.shtml
1000 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=4
1001 * queue: http://blog.tianya.cn
1002   maxThreads    = 1
1003   inProgress    = 0
1004   crawlDelay    = 5000
1005   minCrawlDelay = 0
1006   nextFetchTime = 1405343087577
1007   now           = 1405343082796
1008   0. http://blog.tianya.cn/post-142905-62961160-1.shtml
1009   1. http://blog.tianya.cn/blog/travel
1010   2. http://blog.tianya.cn/post-4598537-62971461-1.shtml
1011   3. http://blog.tianya.cn/post-4598537-62971498-1.shtml
1012 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=4
1013 * queue: http://blog.tianya.cn
1014   maxThreads    = 1
1015   inProgress    = 0
1016   crawlDelay    = 5000
1017   minCrawlDelay = 0
1018   nextFetchTime = 1405343087577
1019   now           = 1405343083799
1020   0. http://blog.tianya.cn/post-142905-62961160-1.shtml
1021   1. http://blog.tianya.cn/blog/travel
1022   2. http://blog.tianya.cn/post-4598537-62971461-1.shtml
1023   3. http://blog.tianya.cn/post-4598537-62971498-1.shtml
1024 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=4
1025 * queue: http://blog.tianya.cn
1026   maxThreads    = 1
1027   inProgress    = 0
1028   crawlDelay    = 5000
1029   minCrawlDelay = 0
1030   nextFetchTime = 1405343087577
1031   now           = 1405343084804
1032   0. http://blog.tianya.cn/post-142905-62961160-1.shtml
1033   1. http://blog.tianya.cn/blog/travel
1034   2. http://blog.tianya.cn/post-4598537-62971461-1.shtml
1035   3. http://blog.tianya.cn/post-4598537-62971498-1.shtml
1036 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=4
1037 * queue: http://blog.tianya.cn
1038   maxThreads    = 1
1039   inProgress    = 0
1040   crawlDelay    = 5000
1041   minCrawlDelay = 0
1042   nextFetchTime = 1405343087577
1043   now           = 1405343085806
1044   0. http://blog.tianya.cn/post-142905-62961160-1.shtml
1045   1. http://blog.tianya.cn/blog/travel
1046   2. http://blog.tianya.cn/post-4598537-62971461-1.shtml
1047   3. http://blog.tianya.cn/post-4598537-62971498-1.shtml
1048 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=4
1049 * queue: http://blog.tianya.cn
1050   maxThreads    = 1
1051   inProgress    = 0
1052   crawlDelay    = 5000
1053   minCrawlDelay = 0
1054   nextFetchTime = 1405343087577
1055   now           = 1405343086809
1056   0. http://blog.tianya.cn/post-142905-62961160-1.shtml
1057   1. http://blog.tianya.cn/blog/travel
1058   2. http://blog.tianya.cn/post-4598537-62971461-1.shtml
1059   3. http://blog.tianya.cn/post-4598537-62971498-1.shtml
1060 fetching http://blog.tianya.cn/post-142905-62961160-1.shtml
1061 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=3
1062 * queue: http://blog.tianya.cn
1063   maxThreads    = 1
1064   inProgress    = 0
1065   crawlDelay    = 5000
1066   minCrawlDelay = 0
1067   nextFetchTime = 1405343092743
1068   now           = 1405343087813
1069   0. http://blog.tianya.cn/blog/travel
1070   1. http://blog.tianya.cn/post-4598537-62971461-1.shtml
1071   2. http://blog.tianya.cn/post-4598537-62971498-1.shtml
1072 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=3
1073 * queue: http://blog.tianya.cn
1074   maxThreads    = 1
1075   inProgress    = 0
1076   crawlDelay    = 5000
1077   minCrawlDelay = 0
1078   nextFetchTime = 1405343092743
1079   now           = 1405343088816
1080   0. http://blog.tianya.cn/blog/travel
1081   1. http://blog.tianya.cn/post-4598537-62971461-1.shtml
1082   2. http://blog.tianya.cn/post-4598537-62971498-1.shtml
1083 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=3
1084 * queue: http://blog.tianya.cn
1085   maxThreads    = 1
1086   inProgress    = 0
1087   crawlDelay    = 5000
1088   minCrawlDelay = 0
1089   nextFetchTime = 1405343092743
1090   now           = 1405343089819
1091   0. http://blog.tianya.cn/blog/travel
1092   1. http://blog.tianya.cn/post-4598537-62971461-1.shtml
1093   2. http://blog.tianya.cn/post-4598537-62971498-1.shtml
1094 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=3
1095 * queue: http://blog.tianya.cn
1096   maxThreads    = 1
1097   inProgress    = 0
1098   crawlDelay    = 5000
1099   minCrawlDelay = 0
1100   nextFetchTime = 1405343092743
1101   now           = 1405343090821
1102   0. http://blog.tianya.cn/blog/travel
1103   1. http://blog.tianya.cn/post-4598537-62971461-1.shtml
1104   2. http://blog.tianya.cn/post-4598537-62971498-1.shtml
1105 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=3
1106 * queue: http://blog.tianya.cn
1107   maxThreads    = 1
1108   inProgress    = 0
1109   crawlDelay    = 5000
1110   minCrawlDelay = 0
1111   nextFetchTime = 1405343092743
1112   now           = 1405343091824
1113   0. http://blog.tianya.cn/blog/travel
1114   1. http://blog.tianya.cn/post-4598537-62971461-1.shtml
1115   2. http://blog.tianya.cn/post-4598537-62971498-1.shtml
1116 fetching http://blog.tianya.cn/blog/travel
1117 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=2
1118 * queue: http://blog.tianya.cn
1119   maxThreads    = 1
1120   inProgress    = 1
1121   crawlDelay    = 5000
1122   minCrawlDelay = 0
1123   nextFetchTime = 1405343092743
1124   now           = 1405343092826
1125   0. http://blog.tianya.cn/post-4598537-62971461-1.shtml
1126   1. http://blog.tianya.cn/post-4598537-62971498-1.shtml
1127 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=2
1128 * queue: http://blog.tianya.cn
1129   maxThreads    = 1
1130   inProgress    = 0
1131   crawlDelay    = 5000
1132   minCrawlDelay = 0
1133   nextFetchTime = 1405343098775
1134   now           = 1405343093829
1135   0. http://blog.tianya.cn/post-4598537-62971461-1.shtml
1136   1. http://blog.tianya.cn/post-4598537-62971498-1.shtml
1137 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=2
1138 * queue: http://blog.tianya.cn
1139   maxThreads    = 1
1140   inProgress    = 0
1141   crawlDelay    = 5000
1142   minCrawlDelay = 0
1143   nextFetchTime = 1405343098775
1144   now           = 1405343094833
1145   0. http://blog.tianya.cn/post-4598537-62971461-1.shtml
1146   1. http://blog.tianya.cn/post-4598537-62971498-1.shtml
1147 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=2
1148 * queue: http://blog.tianya.cn
1149   maxThreads    = 1
1150   inProgress    = 0
1151   crawlDelay    = 5000
1152   minCrawlDelay = 0
1153   nextFetchTime = 1405343098775
1154   now           = 1405343095835
1155   0. http://blog.tianya.cn/post-4598537-62971461-1.shtml
1156   1. http://blog.tianya.cn/post-4598537-62971498-1.shtml
1157 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=2
1158 * queue: http://blog.tianya.cn
1159   maxThreads    = 1
1160   inProgress    = 0
1161   crawlDelay    = 5000
1162   minCrawlDelay = 0
1163   nextFetchTime = 1405343098775
1164   now           = 1405343096838
1165   0. http://blog.tianya.cn/post-4598537-62971461-1.shtml
1166   1. http://blog.tianya.cn/post-4598537-62971498-1.shtml
1167 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=2
1168 * queue: http://blog.tianya.cn
1169   maxThreads    = 1
1170   inProgress    = 0
1171   crawlDelay    = 5000
1172   minCrawlDelay = 0
1173   nextFetchTime = 1405343098775
1174   now           = 1405343097840
1175   0. http://blog.tianya.cn/post-4598537-62971461-1.shtml
1176   1. http://blog.tianya.cn/post-4598537-62971498-1.shtml
1177 fetching http://blog.tianya.cn/post-4598537-62971461-1.shtml
1178 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=1
1179 * queue: http://blog.tianya.cn
1180   maxThreads    = 1
1181   inProgress    = 1
1182   crawlDelay    = 5000
1183   minCrawlDelay = 0
1184   nextFetchTime = 1405343098775
1185   now           = 1405343098843
1186   0. http://blog.tianya.cn/post-4598537-62971498-1.shtml
1187 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=1
1188 * queue: http://blog.tianya.cn
1189   maxThreads    = 1
1190   inProgress    = 1
1191   crawlDelay    = 5000
1192   minCrawlDelay = 0
1193   nextFetchTime = 1405343098775
1194   now           = 1405343099846
1195   0. http://blog.tianya.cn/post-4598537-62971498-1.shtml
1196 -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=1
1197 * queue: http://blog.tianya.cn
1198   maxThreads    = 1
1199   inProgress    = 1
1200   crawlDelay    = 5000
1201   minCrawlDelay = 0
1202   nextFetchTime = 1405343098775
1203   now           = 1405343100849
1204   0. http://blog.tianya.cn/post-4598537-62971498-1.shtml
1205 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=1
1206 * queue: http://blog.tianya.cn
1207   maxThreads    = 1
1208   inProgress    = 0
1209   crawlDelay    = 5000
1210   minCrawlDelay = 0
1211   nextFetchTime = 1405343105959
1212   now           = 1405343101851
1213   0. http://blog.tianya.cn/post-4598537-62971498-1.shtml
1214 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=1
1215 * queue: http://blog.tianya.cn
1216   maxThreads    = 1
1217   inProgress    = 0
1218   crawlDelay    = 5000
1219   minCrawlDelay = 0
1220   nextFetchTime = 1405343105959
1221   now           = 1405343102853
1222   0. http://blog.tianya.cn/post-4598537-62971498-1.shtml
1223 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=1
1224 * queue: http://blog.tianya.cn
1225   maxThreads    = 1
1226   inProgress    = 0
1227   crawlDelay    = 5000
1228   minCrawlDelay = 0
1229   nextFetchTime = 1405343105959
1230   now           = 1405343103855
1231   0. http://blog.tianya.cn/post-4598537-62971498-1.shtml
1232 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=1
1233 * queue: http://blog.tianya.cn
1234   maxThreads    = 1
1235   inProgress    = 0
1236   crawlDelay    = 5000
1237   minCrawlDelay = 0
1238   nextFetchTime = 1405343105959
1239   now           = 1405343104857
1240   0. http://blog.tianya.cn/post-4598537-62971498-1.shtml
1241 -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=1
1242 * queue: http://blog.tianya.cn
1243   maxThreads    = 1
1244   inProgress    = 0
1245   crawlDelay    = 5000
1246   minCrawlDelay = 0
1247   nextFetchTime = 1405343105959
1248   now           = 1405343105859
1249   0. http://blog.tianya.cn/post-4598537-62971498-1.shtml
1250 fetching http://blog.tianya.cn/post-4598537-62971498-1.shtml
1251 -finishing thread FetcherThread, activeThreads=49
1252 -finishing thread FetcherThread, activeThreads=48
1253 -finishing thread FetcherThread, activeThreads=47
1254 -finishing thread FetcherThread, activeThreads=46
1255 -finishing thread FetcherThread, activeThreads=45
1256 -finishing thread FetcherThread, activeThreads=44
1257 -finishing thread FetcherThread, activeThreads=43
1258 -finishing thread FetcherThread, activeThreads=42
1259 -finishing thread FetcherThread, activeThreads=41
1260 -finishing thread FetcherThread, activeThreads=40
1261 -finishing thread FetcherThread, activeThreads=39
1262 -finishing thread FetcherThread, activeThreads=38
1263 -finishing thread FetcherThread, activeThreads=37
1264 -finishing thread FetcherThread, activeThreads=36
1265 -finishing thread FetcherThread, activeThreads=29
1266 -finishing thread FetcherThread, activeThreads=30
1267 -finishing thread FetcherThread, activeThreads=31
1268 -finishing thread FetcherThread, activeThreads=32
1269 -finishing thread FetcherThread, activeThreads=33
1270 -finishing thread FetcherThread, activeThreads=34
1271 -finishing thread FetcherThread, activeThreads=35
1272 -finishing thread FetcherThread, activeThreads=28
1273 -finishing thread FetcherThread, activeThreads=20
1274 -finishing thread FetcherThread, activeThreads=21
1275 -finishing thread FetcherThread, activeThreads=22
1276 -finishing thread FetcherThread, activeThreads=23
1277 -finishing thread FetcherThread, activeThreads=19
1278 -finishing thread FetcherThread, activeThreads=18
1279 -finishing thread FetcherThread, activeThreads=17
1280 -finishing thread FetcherThread, activeThreads=16
1281 -finishing thread FetcherThread, activeThreads=15
1282 -finishing thread FetcherThread, activeThreads=14
1283 -finishing thread FetcherThread, activeThreads=13
1284 -finishing thread FetcherThread, activeThreads=12
1285 -finishing thread FetcherThread, activeThreads=11
1286 -finishing thread FetcherThread, activeThreads=10
1287 -finishing thread FetcherThread, activeThreads=9
1288 -finishing thread FetcherThread, activeThreads=8
1289 -finishing thread FetcherThread, activeThreads=7
1290 -finishing thread FetcherThread, activeThreads=6
1291 -finishing thread FetcherThread, activeThreads=24
1292 -finishing thread FetcherThread, activeThreads=25
1293 -finishing thread FetcherThread, activeThreads=26
1294 -finishing thread FetcherThread, activeThreads=27
1295 -finishing thread FetcherThread, activeThreads=1
1296 -finishing thread FetcherThread, activeThreads=2
1297 -finishing thread FetcherThread, activeThreads=3
1298 -finishing thread FetcherThread, activeThreads=4
1299 -finishing thread FetcherThread, activeThreads=5
1300 -activeThreads=1, spinWaiting=0, fetchQueues.totalSize=0
1301 -activeThreads=1, spinWaiting=0, fetchQueues.totalSize=0
1302 -activeThreads=1, spinWaiting=0, fetchQueues.totalSize=0
1303 -activeThreads=1, spinWaiting=0, fetchQueues.totalSize=0
1304 -activeThreads=1, spinWaiting=0, fetchQueues.totalSize=0
1305 -finishing thread FetcherThread, activeThreads=0
1306 -activeThreads=0, spinWaiting=0, fetchQueues.totalSize=0
1307 -activeThreads=0
1308 Fetcher: finished at 2014-07-14 21:05:17, elapsed: 00:11:39
1309 ParseSegment: starting at 2014-07-14 21:05:17
1310 ParseSegment: segment: data/segments/20140714205330
1311 Parsed (13ms):http://blog.tianya.cn/blog/culture
1312 Parsed (2ms):http://blog.tianya.cn/blog/daren
1313 Parsed (15ms):http://blog.tianya.cn/blog/emotion
1314 Parsed (8ms):http://blog.tianya.cn/blog/ent
1315 Parsed (7ms):http://blog.tianya.cn/blog/finance
1316 Parsed (7ms):http://blog.tianya.cn/blog/food
1317 Parsed (12ms):http://blog.tianya.cn/blog/history
1318 Parsed (6ms):http://blog.tianya.cn/blog/international
1319 Parsed (6ms):http://blog.tianya.cn/blog/life
1320 Parsed (3ms):http://blog.tianya.cn/blog/mingbo
1321 Parsed (7ms):http://blog.tianya.cn/blog/newPush
1322 Parsed (16ms):http://blog.tianya.cn/blog/society
1323 Parsed (8ms):http://blog.tianya.cn/blog/sports
1324 Parsed (8ms):http://blog.tianya.cn/blog/stock
1325 Parsed (20ms):http://blog.tianya.cn/blog/travel
1326 Parsed (6ms):http://blog.tianya.cn/post-1119083-62403495-1.shtml
1327 Parsed (4ms):http://blog.tianya.cn/post-1119083-62958234-1.shtml
1328 Parsed (5ms):http://blog.tianya.cn/post-1119083-62979540-1.shtml
1329 Parsed (0ms):http://blog.tianya.cn/post-1196211-63799917-1.shtml
1330 Parsed (0ms):http://blog.tianya.cn/post-137239-63797690-1.shtml
1331 Parsed (0ms):http://blog.tianya.cn/post-1438407-62987507-1.shtml
1332 Parsed (0ms):http://blog.tianya.cn/post-145340-62426203-1.shtml
1333 Parsed (0ms):http://blog.tianya.cn/post-1515015-63779836-1.shtml
1334 Parsed (0ms):http://blog.tianya.cn/post-1578250-62896383-1.shtml
1335 Parsed (1ms):http://blog.tianya.cn/post-1671874-62898829-1.shtml
1336 Parsed (1ms):http://blog.tianya.cn/post-174091-62981677-1.shtml
1337 Parsed (0ms):http://blog.tianya.cn/post-1755624-62987935-1.shtml
1338 Parsed (0ms):http://blog.tianya.cn/post-1838543-62970839-1.shtml
1339 Parsed (1ms):http://blog.tianya.cn/post-1870300-63794004-1.shtml
1340 Parsed (1ms):http://blog.tianya.cn/post-1882702-63776337-1.shtml
1341 Parsed (1ms):http://blog.tianya.cn/post-1883179-62390915-1.shtml
1342 Parsed (0ms):http://blog.tianya.cn/post-196238-61158698-1.shtml
1343 Parsed (1ms):http://blog.tianya.cn/post-196238-61768175-1.shtml
1344 Parsed (0ms):http://blog.tianya.cn/post-196238-61974722-1.shtml
1345 Parsed (0ms):http://blog.tianya.cn/post-196238-62190438-1.shtml
1346 Parsed (1ms):http://blog.tianya.cn/post-196238-62376389-1.shtml
1347 Parsed (1ms):http://blog.tianya.cn/post-2066284-62926321-1.shtml
1348 Parsed (0ms):http://blog.tianya.cn/post-2111189-62899907-1.shtml
1349 Parsed (0ms):http://blog.tianya.cn/post-234213-62960519-1.shtml
1350 Parsed (0ms):http://blog.tianya.cn/post-236764-58766442-1.shtml
1351 Parsed (0ms):http://blog.tianya.cn/post-236764-59417277-1.shtml
1352 http://blog.tianya.cn/post-236764-60248116-1.shtml skipped. Content of size 65778 was truncated to 64957
1353 Parsed (1ms):http://blog.tianya.cn/post-236764-62962675-1.shtml
1354 Parsed (0ms):http://blog.tianya.cn/post-2513619-62970447-1.shtml
1355 Parsed (0ms):http://blog.tianya.cn/post-3340761-62357537-1.shtml
1356 Parsed (0ms):http://blog.tianya.cn/post-3727390-62972109-1.shtml
1357 Parsed (0ms):http://blog.tianya.cn/post-3739914-62875218-1.shtml
1358 Parsed (0ms):http://blog.tianya.cn/post-3773157-62390018-1.shtml
1359 Parsed (0ms):http://blog.tianya.cn/post-3773157-62890053-1.shtml
1360 Parsed (1ms):http://blog.tianya.cn/post-3773157-62958131-1.shtml
1361 http://blog.tianya.cn/post-38484-61144592-1.shtml skipped. Content of size 154978 was truncated to 64956
1362 Parsed (0ms):http://blog.tianya.cn/post-3941055-62934113-1.shtml
1363 Parsed (0ms):http://blog.tianya.cn/post-3941055-62972581-1.shtml
1364 Parsed (0ms):http://blog.tianya.cn/post-3961685-62977022-1.shtml
1365 Parsed (1ms):http://blog.tianya.cn/post-4009947-62401775-1.shtml
1366 Parsed (0ms):http://blog.tianya.cn/post-4025452-63785440-1.shtml
1367 Parsed (0ms):http://blog.tianya.cn/post-4047683-63794167-1.shtml
1368 Parsed (0ms):http://blog.tianya.cn/post-4101233-62653750-1.shtml
1369 Parsed (0ms):http://blog.tianya.cn/post-4250142-62927024-1.shtml
1370 Parsed (0ms):http://blog.tianya.cn/post-4353581-62972544-1.shtml
1371 Parsed (0ms):http://blog.tianya.cn/post-4353581-62972558-1.shtml
1372 Parsed (0ms):http://blog.tianya.cn/post-4360774-62845782-1.shtml
1373 Parsed (0ms):http://blog.tianya.cn/post-4362114-63792588-1.shtml
1374 Parsed (0ms):http://blog.tianya.cn/post-4482611-62391796-1.shtml
1375 Parsed (0ms):http://blog.tianya.cn/post-4482611-62820517-1.shtml
1376 Parsed (0ms):http://blog.tianya.cn/post-4482611-62900444-1.shtml
1377 Parsed (0ms):http://blog.tianya.cn/post-4487705-62917227-1.shtml
1378 Parsed (0ms):http://blog.tianya.cn/post-4487705-63000074-1.shtml
1379 Parsed (0ms):http://blog.tianya.cn/post-4562315-62367801-1.shtml
1380 Parsed (0ms):http://blog.tianya.cn/post-4562315-62807399-1.shtml
1381 Parsed (0ms):http://blog.tianya.cn/post-4562315-62899385-1.shtml
1382 Parsed (0ms):http://blog.tianya.cn/post-4598537-62379563-1.shtml
1383 Parsed (0ms):http://blog.tianya.cn/post-4598537-62971461-1.shtml
1384 Parsed (0ms):http://blog.tianya.cn/post-4598537-62971498-1.shtml
1385 Parsed (0ms):http://blog.tianya.cn/post-4598537-62971598-1.shtml
1386 Parsed (1ms):http://blog.tianya.cn/post-4600300-62374308-1.shtml
1387 Parsed (0ms):http://blog.tianya.cn/post-4608093-62651701-1.shtml
1388 Parsed (0ms):http://blog.tianya.cn/post-4700528-62898660-1.shtml
1389 Parsed (0ms):http://blog.tianya.cn/post-4700528-62898663-1.shtml
1390 Parsed (0ms):http://blog.tianya.cn/post-4877164-61406732-1.shtml
1391 Parsed (0ms):http://blog.tianya.cn/post-4877164-61415979-1.shtml
1392 Parsed (0ms):http://blog.tianya.cn/post-5010184-62313586-1.shtml
1393 Parsed (0ms):http://blog.tianya.cn/post-5010184-62690266-1.shtml
1394 Parsed (1ms):http://blog.tianya.cn/post-5010184-62718271-1.shtml
1395 Parsed (0ms):http://blog.tianya.cn/post-5010184-62834903-1.shtml
1396 Parsed (0ms):http://blog.tianya.cn/post-5010184-62837336-1.shtml
1397 Parsed (0ms):http://blog.tianya.cn/post-5010184-62889385-1.shtml
1398 Parsed (0ms):http://blog.tianya.cn/post-5010184-62890806-1.shtml
1399 Parsed (0ms):http://blog.tianya.cn/post-542686-63799203-1.shtml
1400 Parsed (0ms):http://blog.tianya.cn/post-544588-62883194-1.shtml
1401 Parsed (0ms):http://blog.tianya.cn/post-78180-58859246-1.shtml
1402 Parsed (0ms):http://blog.tianya.cn/post-78180-59109533-1.shtml
1403 Parsed (0ms):http://blog.tianya.cn/post-78180-62903890-1.shtml
1404 Parsed (0ms):http://blog.tianya.cn/post-78180-62980961-1.shtml
1405 Parsed (1ms):http://blog.tianya.cn/post-863996-62974859-1.shtml
1406 Parsed (1ms):http://blog.tianya.cn/post-959477-62971507-1.shtml
1407 ParseSegment: finished at 2014-07-14 21:05:30, elapsed: 00:00:13
1408 CrawlDb update: starting at 2014-07-14 21:05:30
1409 CrawlDb update: db: data/crawldb
1410 CrawlDb update: segments: [data/segments/20140714205330]
1411 CrawlDb update: additions allowed: true
1412 CrawlDb update: URL normalizing: true
1413 CrawlDb update: URL filtering: true
1414 CrawlDb update: 404 purging: false
1415 CrawlDb update: Merging segment data into db.
1416 CrawlDb update: finished at 2014-07-14 21:05:43, elapsed: 00:00:13
1417 LinkDb: starting at 2014-07-14 21:05:43
1418 LinkDb: linkdb: data/linkdb
1419 LinkDb: URL normalize: true
1420 LinkDb: URL filter: true
1421 LinkDb: internal links will be ignored.
1422 LinkDb: adding segment: file:/home/lan/nutch/local/data/segments/20140714205231
1423 LinkDb: adding segment: file:/home/lan/nutch/local/data/segments/20140714205330
1424 LinkDb: finished at 2014-07-14 21:05:53, elapsed: 00:00:10
1425 crawl finished: data

The whole crawl took about 13 minutes. Near the bottom of the log you can see the parse time for each page.

Run this command:

cat nohup.out|grep elapsed

The output is:

Injector: finished at 2014-07-14 20:52:23, elapsed: 00:00:19
Generator: finished at 2014-07-14 20:52:39, elapsed: 00:00:15
Fetcher: finished at 2014-07-14 20:53:01, elapsed: 00:00:22
ParseSegment: finished at 2014-07-14 20:53:08, elapsed: 00:00:07
CrawlDb update: finished at 2014-07-14 20:53:22, elapsed: 00:00:13
Generator: finished at 2014-07-14 20:53:37, elapsed: 00:00:15
Fetcher: finished at 2014-07-14 21:05:17, elapsed: 00:11:39
ParseSegment: finished at 2014-07-14 21:05:30, elapsed: 00:00:13
CrawlDb update: finished at 2014-07-14 21:05:43, elapsed: 00:00:13
LinkDb: finished at 2014-07-14 21:05:53, elapsed: 00:00:10

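Every crawl starts with Injector (inject the URLs). One round then consists of Generator (build the fetch list), Fetcher (fetch), ParseSegment (parse the content), and CrawlDb update (update the CrawlDb), and the crawl finishes with LinkDb. Because no topN was set, the second-level Fetch took more than 11 minutes.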
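To see where the time goes per phase, the grep output can be summed with a few lines of Python. This is only a sketch: the sample lines are copied from the output above, and the helper name `elapsed_by_phase` is my own, not part of Nutch.

```python
import re
from collections import defaultdict

# Sample lines copied from the "grep elapsed" output of this crawl.
LOG = """\
Injector: finished at 2014-07-14 20:52:23, elapsed: 00:00:19
Generator: finished at 2014-07-14 20:52:39, elapsed: 00:00:15
Fetcher: finished at 2014-07-14 20:53:01, elapsed: 00:00:22
ParseSegment: finished at 2014-07-14 20:53:08, elapsed: 00:00:07
CrawlDb update: finished at 2014-07-14 20:53:22, elapsed: 00:00:13
Generator: finished at 2014-07-14 20:53:37, elapsed: 00:00:15
Fetcher: finished at 2014-07-14 21:05:17, elapsed: 00:11:39
ParseSegment: finished at 2014-07-14 21:05:30, elapsed: 00:00:13
CrawlDb update: finished at 2014-07-14 21:05:43, elapsed: 00:00:13
LinkDb: finished at 2014-07-14 21:05:53, elapsed: 00:00:10
"""

# "<phase>: finished at <timestamp>, elapsed: HH:MM:SS"
ELAPSED = re.compile(
    r"^(?P<phase>[^:]+): finished at .*, elapsed: (?P<h>\d+):(?P<m>\d+):(?P<s>\d+)$"
)


def elapsed_by_phase(text):
    """Sum the elapsed seconds of every phase found in the log text."""
    totals = defaultdict(int)
    for line in text.splitlines():
        match = ELAPSED.match(line.strip())
        if match:
            seconds = (int(match.group("h")) * 3600
                       + int(match.group("m")) * 60
                       + int(match.group("s")))
            totals[match.group("phase")] += seconds
    return dict(totals)


totals = elapsed_by_phase(LOG)
total_seconds = sum(totals.values())
for phase, seconds in sorted(totals.items(), key=lambda item: -item[1]):
    print(f"{phase}: {seconds}s")
print(f"total: {total_seconds}s")
```

Sorting by time makes it obvious that the two Fetcher runs dominate the crawl.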

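Here is an article that walks through the steps of the Nutch crawl command in detail: http://www.cnblogs.com/huligong1234/p/3515214.html

The commands below are given only as examples; see the link above for detailed explanations.

readdb:  readdb is an alias for "org.apache.nutch.crawl.CrawlDbReader". It prints or exports information from the crawl database (crawldb).

       Example 1: ./bin/nutch readdb data/crawldb -stats

          data/crawldb is where the data of the finished crawl is stored

          -stats prints statistics to Java standard output, such as the number of URLs, fetched URLs, and unfetched URLs

Output: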

CrawlDb statistics start: data/crawldb
Statistics for CrawlDb: data/crawldb
TOTAL urls:    1469
retry 0:    1469
min score:    0.0
avg score:    0.0017549354
max score:    1.032
status 1 (db_unfetched):    1368
status 2 (db_fetched):    97
status 4 (db_redir_temp):    3
status 5 (db_redir_perm):    1
CrawlDb statistics: done
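The status lines of the -stats output are easy to post-process. The sketch below uses the numbers from the output above to compute how much of the crawldb has actually been fetched; the function name `parse_status_counts` is hypothetical.

```python
import re

# Status lines copied from the readdb -stats output above.
STATS = """\
TOTAL urls:    1469
status 1 (db_unfetched):    1368
status 2 (db_fetched):    97
status 4 (db_redir_temp):    3
status 5 (db_redir_perm):    1
"""


def parse_status_counts(text):
    """Map each status name (e.g. db_fetched) to its URL count."""
    counts = {}
    for line in text.splitlines():
        match = re.match(r"status \d+ \((\w+)\):\s+(\d+)", line.strip())
        if match:
            counts[match.group(1)] = int(match.group(2))
    return counts


counts = parse_status_counts(STATS)
total = sum(counts.values())
fetched_share = counts["db_fetched"] / total
print(counts)
print(f"fetched {counts['db_fetched']} of {total} URLs ({fetched_share:.1%})")
```

The per-status counts add up to the TOTAL urls line, which is a quick sanity check on the parse.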

       Example 2: ./bin/nutch readdb data/crawldb -dump data/crawldb/crawldb_dump

           -dump writes the crawldb records to the output path that follows it

       Example 3: ./bin/nutch readdb data/crawldb -url http://zxcvbnm20111.blog.tianya.cn/

          Prints the details of the URL http://zxcvbnm20111.blog.tianya.cn/

          I found this URL in data/crawldb/crawldb_dump after running the command in Example 2

          Output:

CrawlDb dump: starting
CrawlDb db: data/crawldb
CrawlDb dump: done
lan@Ubuntu1:~/nutch/local$ 
lan@Ubuntu1:~/nutch/local$ ./bin/nutch readdb data/crawldb -url http://zxcvbnm20111.blog.tianya.cn/
URL: http://zxcvbnm20111.blog.tianya.cn/
Version: 7
Status: 1 (db_unfetched)
Fetch time: Mon Jul 14 21:05:40 CST 2014
Modified time: Thu Jan 01 08:30:00 CST 1970
Retries since fetch: 0
Retry interval: 2592000 seconds (30 days)
Score: 2.9411764E-4
Signature: null
Metadata: 

      Example 4: ./bin/nutch readdb data/crawldb -topN 10 data/crawldb/crawldb_topN 0.5

          Writes the 10 highest-scoring URLs whose score is >= 0.5, together with their scores, to data/crawldb/crawldb_topN

readseg:  Example 1: ./bin/nutch readseg -dump data/segments/20140714205330 data/segments/dump -nocontent -nofetch -noparse -noparsedata -noparsetext

          Dumps a segment to data/segments/dump. Because -nogenerate is the only flag left out, only the generate information of the segment is written

          To see the fetch information instead, change -nofetch to -nogenerate

          To see the content instead, change -nocontent to -nogenerate

          The same applies to parse, parsedata, and parsetext

       

      Example 2: ./bin/nutch readseg -list -dir data/segments

          Lists every segment that has been generated

      Example 3: ./bin/nutch readseg -get data/segments/20140714205231 http://blog.tianya.cn/

          Shows what a segment stores for the given URL. Wow, that is a big pile of HTML code and content~

readlinkdb:  Example 1: ./bin/nutch readlinkdb data/linkdb -dump data/linkdb/dump

            Dumps the linkdb to data/linkdb/dump

         Example 2: ./bin/nutch readlinkdb data/linkdb -url http://cnrdn.com/4NJC

            Looks up one specific URL. I copied this URL from the dump above

            The output is the same text as the lines below that URL in the dump

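The Nutch crawl cycle

generate -> fetch -> parse -> update db

In fact, the crawl command is equivalent to inject + generate + fetch + parse + updatedb + invertlinks, with generate through updatedb repeated once per depth level:

inject:     Example:  ./bin/nutch inject data/crawldb urls

            Injects the URLs to crawl into the crawldb. The URLs are read from all files in the urls directory and injected into data/crawldb.

            Make sure the data directory does not exist yet

generate:      Example: ./bin/nutch generate data/crawldb data/segments

fetch:      Example: ./bin/nutch fetch data/segments/20140716205702 -threads 3

parse:     Example: ./bin/nutch parse data/segments/20140716205702 

updatedb:   Example: ./bin/nutch updatedb data/crawldb -dir data/segments

mergesegs:   Example: ./bin/nutch mergesegs data2/segments_all -dir data2/segments

            Note: the segments directory and its subdirectories must not contain anything you created yourself

            A very useful command: the merged result is smaller. The more files there are and the larger they are, the bigger the gain, and I/O gets faster.
            Similar commands are mergedb and mergelinkdb

invertlinks:  Example: ./bin/nutch invertlinks data/linkdb -dir data/segments

            Note: the segments directory and its subdirectories must not contain anything you created yourself.

            Computes inverted links: it analyzes the newly supplied segment directories and builds a new inverted-link database,
            then merges the new database with the existing one

            Knowing how many pages point to a page is what makes it possible to compute a score for that page

parsechecker:  Example 1: ./bin/nutch parsechecker http://apdplat.org

            A convenient way to see which links a page contains

         Example 2: ./bin/nutch parsechecker -dumpText http://apdplat.org

            Shows the text extracted from the page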
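The inject, generate, fetch, parse, updatedb, and invertlinks sequence from this section can be written down as a small command builder. This is a sketch, not the crawl command's real implementation: the function name `crawl_commands` is my own, the commands are only printed rather than executed, and the segment name is passed in by hand because generate names the segment after the current timestamp.

```python
NUTCH = "./bin/nutch"


def crawl_commands(urls_dir, data_dir, segment, threads=3):
    """Build the command lines for one crawl round, in execution order.

    `segment` is the directory that generate created under
    data_dir/segments; in a real run you would look it up after
    generate has finished.
    """
    crawldb = f"{data_dir}/crawldb"
    segments = f"{data_dir}/segments"
    linkdb = f"{data_dir}/linkdb"
    return [
        [NUTCH, "inject", crawldb, urls_dir],
        [NUTCH, "generate", crawldb, segments],
        [NUTCH, "fetch", segment, "-threads", str(threads)],
        [NUTCH, "parse", segment],
        [NUTCH, "updatedb", crawldb, "-dir", segments],
        [NUTCH, "invertlinks", linkdb, "-dir", segments],
    ]


commands = crawl_commands("urls", "data", "data/segments/20140716205702")
for command in commands:
    print(" ".join(command))
```

To actually run a round, each list could be handed to `subprocess.run(command, check=True)` from the local directory; for a depth greater than 1, the generate to updatedb part would be repeated with the newly generated segment.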

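Domain statistics

./bin/nutch domainstats data/crawldb/current host host

The first host is the output directory, the second host is the output mode

./bin/nutch domainstats data/crawldb/current domain domain

./bin/nutch domainstats data/crawldb/current suffix suffix 

./bin/nutch domainstats data/crawldb/current tld tld

From the host level to the tld level the output gets shorter and shorter, because each later level groups together the URLs of the earlier levels.

Take the URL http://www.cnblogs.com.cn/ as an example: the host is www.cnblogs.com.cn, the domain is cnblogs.com.cn, the suffix is com.cn, and the tld (top-level domain) is the last part, here cn. If the URL is http://www.cnblogs.com, then tld and suffix are both com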
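The four levels can be illustrated with a small function. This is a sketch under a strong assumption: `KNOWN_SUFFIXES` is a tiny hand-written list that only covers the two example URLs, and `split_levels` is a hypothetical helper, not the code domainstats uses.

```python
from urllib.parse import urlparse

# Hypothetical, deliberately tiny suffix list covering only the examples above.
KNOWN_SUFFIXES = {"com.cn", "com", "cn"}


def split_levels(url):
    """Return the host, domain, suffix, and tld of a URL."""
    host = urlparse(url).hostname
    labels = host.split(".")
    # Try the longest candidate first so that com.cn wins over cn.
    for i in range(1, len(labels)):
        suffix = ".".join(labels[i:])
        if suffix in KNOWN_SUFFIXES:
            return {
                "host": host,
                "domain": ".".join(labels[i - 1:]),  # suffix plus one more label
                "suffix": suffix,
                "tld": labels[-1],  # always the last label
            }
    raise ValueError(f"no known suffix for host {host!r}")


print(split_levels("http://www.cnblogs.com.cn/"))
print(split_levels("http://www.cnblogs.com"))
```

Each level is a coarser grouping key than the one before it, which is why the tld output has the fewest lines.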

webgraph

./bin/nutch webgraph -segmentDir data/segments -webgraphdb data/webgraphdb

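This specifies the segments input path and the webgraphdb output path. Outlinks, Inlinks, and Nodes are generated under data/webgraphdb

They hold, respectively, the outlinks and their counts, the inlinks and their counts, and the URLs with their scores

After the first run of the webgraph command, every URL in nodes has a score of 0, so the linkrank command has to be run

Outlinks are stored in parse_data, so the input of the OutlinkDb is parse_data

From the outlinks, the inlinks of every page can be derived, and from those a score can be computed for each page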
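Deriving inlinks from outlinks is just an inversion of the link map. A minimal sketch, with made-up example URLs and a hypothetical helper `invert_links` (this is the idea, not Nutch's MapReduce job):

```python
from collections import defaultdict


def invert_links(outlinks):
    """Turn {source: [targets]} into {target: [sources]}."""
    inlinks = defaultdict(set)
    for source, targets in outlinks.items():
        for target in targets:
            inlinks[target].add(source)
    # Sort the sources so the result is deterministic.
    return {url: sorted(sources) for url, sources in inlinks.items()}


# Made-up outlink data for illustration.
outlinks = {
    "http://a.example/": ["http://b.example/", "http://c.example/"],
    "http://b.example/": ["http://c.example/"],
}
inlinks = invert_links(outlinks)
for url, sources in inlinks.items():
    print(url, len(sources), sources)
```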

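nodedumper and linkrank

./bin/nutch nodedumper -topn 1 -inlinks -output inlinks_topn_1 -webgraphdb data/webgraphdb
  Dumps the contents of data/webgraphdb so you can see the URLs and their inlink counts

  -asSequenceFile writes a sequence file. Sequence files are binary, so it is not used here

  -topn: output only the top n entries

  -inlinks: sort by inlink count in descending order; -outlinks and -scores work the same way

  -output: the output directory

  -webgraphdb: the path of the webgraphdb

  If you sort by scores, you will see in the generated file that every URL has a score of 0,

  which shows that after running only the webgraph command all URL scores are 0

./bin/nutch linkrank -webgraphdb data/webgraphdb

Computes the scores and stores them

Then run:

./bin/nutch nodedumper -topn 1 -scores -output after-inject-scores -webgraphdb data/webgraphdb

The URL scores in the files under the after-inject-scores directory are no longer 0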
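To get a feeling for how link-based scoring turns inlinks into non-zero scores, here is a simplified PageRank-style iteration. It is not Nutch's LinkRank code: the damping factor, the iteration count, the three-node graph, and the function name `link_scores` are all assumptions for illustration.

```python
def link_scores(outlinks, iterations=50, damping=0.85):
    """Iteratively pass each node's score along its outlinks."""
    nodes = set(outlinks)
    for targets in outlinks.values():
        nodes.update(targets)
    n = len(nodes)
    scores = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every node starts each round with the same small base score.
        new_scores = {node: (1.0 - damping) / n for node in nodes}
        for source, targets in outlinks.items():
            if not targets:
                continue
            share = damping * scores[source] / len(targets)
            for target in targets:
                new_scores[target] += share
        scores = new_scores
    return scores


# Made-up graph: a and c both link to b, and b links back to a.
scores = link_scores({"a": ["b"], "b": ["a"], "c": ["b"]})
for node, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(node, round(score, 4))
```

The node with the most inlinks (b) ends up with the highest score, and the node nobody links to (c) keeps only the base score.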

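./bin/nutch nodedumper -group domain sum -inlinks -output inlink_domain_sum -webgraphdb data/webgraphdb

Generates grouped data

domain can be replaced with host, and sum can be replaced with max. These two arguments must come right after -group

If you add -topn 1 to the command above and change the output path to inlink_domain_sum_1, you will find that some of the inlink counts in this output are smaller

This shows that nodedumper groups first and then sums only the top 1 of each group (so the sum equals the largest inlink count in each group)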
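The behavior observed above (group first, then aggregate only the top n of each group) can be reproduced in a few lines. This sketch only models my observation of the output; the data, the `domain_of` mapping, and the function name `group_inlinks` are made up, and it is not how nodedumper is implemented internally.

```python
from collections import defaultdict


def group_inlinks(counts_by_url, domain_of, topn=None, mode="sum"):
    """Group inlink counts by domain, keep the top n per group, then aggregate."""
    groups = defaultdict(list)
    for url, count in counts_by_url.items():
        groups[domain_of[url]].append(count)
    result = {}
    for domain, counts in groups.items():
        kept = sorted(counts, reverse=True)
        if topn is not None:
            kept = kept[:topn]
        result[domain] = sum(kept) if mode == "sum" else max(kept)
    return result


# Made-up inlink counts for illustration.
counts_by_url = {
    "http://a.example.com/1": 5,
    "http://a.example.com/2": 3,
    "http://b.example.org/": 2,
}
domain_of = {
    "http://a.example.com/1": "example.com",
    "http://a.example.com/2": "example.com",
    "http://b.example.org/": "example.org",
}

print(group_inlinks(counts_by_url, domain_of))
print(group_inlinks(counts_by_url, domain_of, topn=1))
print(group_inlinks(counts_by_url, domain_of, mode="max"))
```

With topn=1 the sum of each group collapses to its maximum, which matches the smaller numbers seen in inlink_domain_sum_1.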

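Injecting scores

./bin/nutch scoreupdater -crawldb data/crawldb -webgraphdb data/webgraphdb

By default the crawl command uses the opic plugin to compute scores. The webgraph way of computing scores has been available since 1.0 and

is more complete.

Lightweight crawling with freegen

./bin/nutch freegen urls2 data3/segments

The urls2 directory holds a newly created URL file containing a single URL: http://apdplat.org

The newly generated segment is written to data3/segments

This command lets you bypass the huge crawldb and generate segments directly from a given set of URLs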

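Configuring the Solr server

Check whether the indexing plugins are configured correctly:

./bin/nutch indexchecker http://www.163.com

In the output, title and content are the important fields

It took me a long time to find the download links for 3.6.2 and 4.2.0. They can no longer be downloaded from the home page.

All Solr versions can be downloaded here: http://archive.apache.org/dist/lucene/solr/

3.6.2 is used here

Configuring Solr

1. Copy Nutch's conf/schema.xml into Solr's /example/solr/conf, and remember to back up Solr's own schema.xml first

 Search for index- in Nutch's conf/nutch-default.xml and you will find the following XML: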

<property>
  <name>plugin.includes</name>
  <value>protocol-http|urlfilter-regex|parse-(html|tika)|index-(basic|anchor)|scoring-opic|urlnormalizer-(pass|regex|basic)</value>
  <description>Regular expression naming plugin directory names to
  include.  Any plugin not matching this expression is excluded.
  In any case you need at least include the nutch-extensionpoints plugin. By
  default Nutch includes crawling just HTML and plain text via HTTP,
  and basic indexing and search plugins. In order to use HTTPS please enable 
  protocol-httpclient, but be aware of possible intermittent problems with the 
  underlying commons-httpclient library.
  </description>
</property>

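This is the setting that includes plugins into Nutch; index-basic and index-anchor are two of those plugins

Open Solr's /example/solr/conf/schema.xml (the Nutch schema.xml that was just copied)

Search for index- again; there are two sections: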

 <!-- fields for index-basic plugin -->
        <field name="host" type="string" stored="false" indexed="true"/>
        <field name="url" type="url" stored="true" indexed="true"
            required="true"/>
        <field name="content" type="text" stored="false" indexed="true"/>
        <field name="title" type="text" stored="true" indexed="true"/>
        <field name="cache" type="string" stored="true" indexed="false"/>
        <field name="tstamp" type="date" stored="true" indexed="false"/>

        <!-- fields for index-anchor plugin -->
        <field name="anchor" type="string" stored="true" indexed="true"
            multiValued="true"/>

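The field elements define the fields of these two plugins; besides them there is also an id field further up

2. In Solr's /example/solr/conf/solrconfig.xml, change every <str name="df">text</str> to <str name="df">content</str>

The default search field should be content

3. Start Solr. In the example directory run start.jar: java -jar start.jar &   (runs in the background)

If step 2 was skipped, Solr reports an error saying that text cannot be found

4. Open localhost:8983 in a browser

You can see the Solr admin page. Solr embeds a Jetty server, so it can be managed through the browser

5. Back in Nutch's local directory, run bin/nutch | grep solr

There are three commands: solrindex (build the Solr index), solrdedup (remove duplicates), and solrclean (remove URLs that are 301 permanent redirects or 404)

6. Use the crawldb, linkdb, and segments to submit the index to http://localhost:8983/solr:

In the local directory run: bin/nutch solrindex http://localhost:8983/solr data/crawldb -linkdb data/linkdb -dir data/segments

The output includes: Indexing 3 documents

If there are more than 250 documents, they are indexed in several batches. The batch size can be set with solr.commit.size in conf/nutch-default.xml or nutch-site.xml (the default file contains a sample)

A larger value improves throughput but uses more memory

Solr stores the index in example/solr/data/index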
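Step 2 above (changing every df value from text to content) does not have to be done by hand. A minimal sketch, assuming the df entries look exactly like the string quoted in step 2; the sample fragment and the function name `set_default_field` are hypothetical, and a real solrconfig.xml is much longer.

```python
def set_default_field(solrconfig_xml, old="text", new="content"):
    """Replace every <str name="df">old</str> with <str name="df">new</str>."""
    return solrconfig_xml.replace(
        f'<str name="df">{old}</str>',
        f'<str name="df">{new}</str>',
    )


# Hypothetical fragment standing in for solrconfig.xml.
SAMPLE = """\
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="df">text</str>
  </lst>
</requestHandler>
<requestHandler name="/browse" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="df">text</str>
  </lst>
</requestHandler>
"""

updated = set_default_field(SAMPLE)
print(updated)
```

For the real file you would read example/solr/conf/solrconfig.xml, apply the function, and write the result back after making a backup.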

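Using Luke

Luke is a toolbox for Lucene indexes. It makes it easy to browse and search an index, which helps with debugging

Download: http://code.google.com/p/luke/downloads/list

The version used here is lukeall-4.0.0-ALPHA.jar

Copy Solr's example/solr/data/index directory to your local machine (I copied the index directory to the Windows desktop and put the Luke jar on the desktop too)

Double-click the jar to run Luke. Luke automatically prompts you to point it at the index directory.

There are 10 fields. The box at the lower left shows the fields from schema.xml: 1 id, 3 core fields, 5 index-basic fields (I don't know why the cache field is missing), and 1 index-anchor field

Select a field and click show top term to see the actual terms

The id field is kept whole and is not tokenized

Click the Documents tab to browse the fields by document number

Note: first click the green left arrow at the top left (only so that the box has some content and the field information shows up), then click the green right arrow:

Pick a term from the title field, for example 2014. Click the Search tab, type title:2014 into the box at the top left, and click Search to find the indexed documents. Note that some titles may not be found: make sure the tokenizer used at indexing time and the tokenizer used at search time are the same! For example, with title=明星娱乐圈, Luke splits the whole title into single characters before searching, so nothing is found. Setting the tokenizer is covered later.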
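Why mismatched tokenizers lead to empty results can be shown with a toy inverted index. Everything here is an assumption for illustration: the two-word dictionary, the greedy longest-match segmenter, and the helper names are mine, and none of it is Lucene, Luke, or mmseg4j code.

```python
from functools import partial


def char_tokens(text):
    """Tokenizer A: every character becomes its own term."""
    return list(text)


def word_tokens(text, dictionary):
    """Tokenizer B: greedy longest match against a small dictionary."""
    tokens, i = [], 0
    while i < len(text):
        for length in range(len(text) - i, 0, -1):
            piece = text[i:i + length]
            # Fall back to a single character when no word matches.
            if length == 1 or piece in dictionary:
                tokens.append(piece)
                i += length
                break
    return tokens


def build_index(docs, analyze):
    """Map each term to the set of document ids that contain it."""
    index = {}
    for doc_id, text in docs.items():
        for term in analyze(text):
            index.setdefault(term, set()).add(doc_id)
    return index


def search(index, query, analyze):
    """Return the documents that contain every term of the analyzed query."""
    hits = [index.get(term, set()) for term in analyze(query)]
    return set.intersection(*hits) if hits else set()


words = partial(word_tokens, dictionary={"明星", "娱乐圈"})
index = build_index({1: "明星娱乐圈"}, words)

print(search(index, "娱乐圈", words))        # same tokenizer on both sides
print(search(index, "娱乐圈", char_tokens))  # different tokenizer at search time
```

The index only contains the terms 明星 and 娱乐圈, so a query that was split into single characters has nothing to match against.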

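Configuring the mmseg4j tokenizer in Solr

Solr's built-in tokenizers handle Chinese poorly, which is why Luke could not find the indexed documents, so mmseg4j is used instead

Download: https://code.google.com/p/mmseg4j/downloads/list

The version used here is mmseg4j-1.8.5.zip

1. Stop Solr. Use the jps command to find the process id, then run kill -9 <pid> to shut Solr down

2. Delete Solr's example/solr/data directory

3. Create a lib directory under Solr's example/solr

4. Copy mmseg4j-all-1.8.5-with-dic.jar from mmseg4j into Solr's example/solr/lib so that the Solr server loads this jar

5. Edit Solr's example/solr/conf/schema.xml:

  Replace <tokenizer class="solr.WhitespaceTokenizerFactory"/> and <tokenizer class="solr.StandardTokenizerFactory"/>

  with <tokenizer class="com.chenlb.mmseg4j.solr.MMSegTokenizerFactory" mode="complex"/>

  This configures Solr to tokenize with the mmseg4j class, because the default tokenizers do a poor job on Chinese

After that, start the Solr server again, submit the index to Solr, and open the index directory in Luke. Searching Chinese fields now finds results~

You will notice that Number of terms in Luke is clearly smaller. That is because with mmseg4j configured, Solr groups several Chinese characters into one term, whereas before each character may have been its own term.


You can also search from the Query String box on the localhost:8983/solr/admin page. The query syntax is the same as in Luke: "field name:value"

Although Solr's tokenizer is now configured, Luke itself is not yet using mmseg4j as its tokenizer.

Searching for "title:客户端" in Luke splits the query into 3 terms (Luke has not been told to use mmseg4j yet),

whereas once Luke is configured with mmseg4j, "客户端" is treated as a single term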

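Configuring mmseg4j in Luke

Although Luke can now find the indexed documents, Solr and Luke should use the same tokenizer.

Version 1.8.5 conflicts with Luke 4.0, so Luke uses mmseg4j 1.9.1. The download link is given above.

The file used here is mmseg4j-1.9.1.v20130120-SNAPSHOT.zip

Extract the three jars in the dist directory of mmseg4j-1.9.1, and copy the extracted data and com folders into the Luke jar

Open Luke and select the index directory. Click the Search tab; on the right there is a drop-down for the analyzer class, where you choose ComplexAnalyzer. To the right of the drop-down, set the default field to content

Search for: title:客户端

The Query Details box shows that the query term is "客户端". Without this configuration, "客户端" would be split into 3 terms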

solr4.2

The Solr directories mentioned below all start from

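1. In Solr 4.2, example/solr/ has an additional collection1 directory. Copy Nutch's local/conf/schema-solr4.xml into the conf directory under Solr 4.2's example/solr/collection1 and rename it to schema.xml

2. Solr 4.2 does not need the text-to-content change that was made for 3.6.2. How do you tell which field to change to? Open schema.xml and scroll down to this element:

<defaultSearchField>content</defaultSearchField>

In Solr 4.2 this element is already text, so nothing needs to change. In Solr 3.6.2 this element is content, so every text had to be changed to content

3. Add a _version_ field inside the fields element of schema.xml, otherwise Solr reports an error at startup: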

<field name="_version_" type="long" stored="true" indexed="true"/>

4. Start Solr, again by running start.jar

5. Copy the mmseg4j-1.9.1 jars. As before, copying the jars is enough: copy the jars from mmseg4j's dist directory into the lib directory under Solr 4.2's collection1. If there is no lib directory, create it with mkdir first

6. Configure mmseg4j.

  Edit Solr's example/solr/collection1/conf/schema.xml:

  Replace <tokenizer class="solr.WhitespaceTokenizerFactory"/> and <tokenizer class="solr.StandardTokenizerFactory"/>

  with <tokenizer class="com.chenlb.mmseg4j.solr.MMSegTokenizerFactory" mode="complex"/>

  Be careful not to change any tokenizer other than WhitespaceTokenizerFactory and StandardTokenizerFactory
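The replacement in step 6 can be scripted so that only the two named tokenizers are touched. A sketch, assuming the tokenizer elements are written exactly as quoted above (self-closing, no extra attributes); the sample fragment and the function name `use_mmseg` are hypothetical.

```python
import re

MMSEG = ('<tokenizer class="com.chenlb.mmseg4j.solr.MMSegTokenizerFactory" '
         'mode="complex"/>')

# Matches only the Whitespace and Standard tokenizer factories.
TARGETS = re.compile(
    r'<tokenizer class="solr\.(?:Whitespace|Standard)TokenizerFactory"\s*/>'
)


def use_mmseg(schema_xml):
    """Swap the two default tokenizers for mmseg4j, leaving all others alone."""
    return TARGETS.sub(MMSEG, schema_xml)


# Hypothetical fragment standing in for schema.xml.
SAMPLE = """\
<analyzer><tokenizer class="solr.WhitespaceTokenizerFactory"/></analyzer>
<analyzer><tokenizer class="solr.StandardTokenizerFactory"/></analyzer>
<analyzer><tokenizer class="solr.KeywordTokenizerFactory"/></analyzer>
"""

updated = use_mmseg(SAMPLE)
print(updated)
```

Because the pattern names the two classes explicitly, a tokenizer such as KeywordTokenizerFactory stays unchanged.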

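After that you can submit the index to Solr

Once the index is submitted, open core admin at localhost:8983 to see the number of indexed documents and other information:

At the lower left there is a Core Selector drop-down where you can choose collection1. Then click query below it, and you can query the index in the panel on the right: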

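Installing Nutch on Cygwin

Running Linux in a virtual machine on Windows is fairly heavyweight:

my Ubuntu VM needs 4 GB of memory to run without too much lag.

Cygwin is lightweight by comparison, and it makes it easy to use Windows files from inside the Cygwin environment.

If the Java path contains a space, the nutch crawl command fails.

For example, when using Cygwin, first copy apache-nutch-1.6 into home/Administrator under the Cygwin directory (the name depends on your Windows user name),

then open Cygwin, go to Nutch's bin directory, and run ./nutch crawl. It reports that the directory does not exist (if your Java is installed under c:/Program Files/).

Solution:

Copy the whole Java directory into Cygwin's home/Administrator directory,

and set JAVA_HOME to c:/cygwin/home/Administrator/Java/jdk1.6.0_21

If the machine has several JDKs, set NUTCH_JAVA_HOME for Cygwin instead.

Note that the environment variable in Cygwin holds a Windows path.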

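Nutch and Hadoop

A Hadoop tutorial can be found here: http://www.cnblogs.com/lanhj/p/3841709.html

The Nutch used here is the deploy directory produced earlier by the ant build, that is, the version of Nutch that submits its jobs to Hadoop

Add the http.agent.name property to Nutch's conf/nutch-site.xml (shown earlier) and that part is done~

Of course, Hadoop must at least be able to run the bundled WordCount.java successfully, and the HADOOP_HOME environment variable must be set.

As usual, first create a urls directory under deploy and put a file containing the URLs in it

Then run the crawl command: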

bin/nutch crawl urls -dir data -threads 50 -depth 2 -topN 1

Did it report an invalid input error?

The reason: the job is executed by Hadoop, and Hadoop's default paths are on HDFS, so urls has to be uploaded to HDFS:

hadoop fs -put urls /user/xxx/

hadoop fs -ls /user/xxx

The first command uploads urls to the /user/xxx directory on HDFS (the Nutch job expects the urls to inject under /user/<username>/). The second command lists the files in that directory.
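The reason the relative path urls ends up under /user/xxx is that HDFS resolves relative paths against the user's home directory, /user/<username>. The rule can be sketched as follows; `resolve_hdfs_path` is a hypothetical helper that only mimics the rule and does not call Hadoop.

```python
import posixpath


def resolve_hdfs_path(path, user):
    """Resolve a path the way HDFS does for relative paths.

    Absolute paths are kept; relative paths are placed under /user/<user>.
    """
    if path.startswith("/"):
        return posixpath.normpath(path)
    return posixpath.normpath(posixpath.join("/user", user, path))


print(resolve_hdfs_path("urls", "xxx"))       # what "crawl urls" reads
print(resolve_hdfs_path("data", "xxx"))       # where "-dir data" is written
print(resolve_hdfs_path("/tmp/urls", "xxx"))  # absolute paths are unchanged
```

This is also why the data directory of the crawl shows up under /user/xxx once the job has finished.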

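You can see that urls has been uploaded to HDFS (in my other post I was lazy and have not yet written about HDFS concepts, commands, and API. The slides and code I made earlier still exist; I will upload them when I have time)

Run the crawl command again.

While the command is halfway through, you can open http://localhost:50030 (the Hadoop page for MapReduce and the JobTracker)

and see the Map tasks or Reduce tasks

When it has finished, look at the /user/xxx directory on HDFS again and you will find that a data directory has been created

If the HDFS file commands feel cumbersome, open localhost:50070 and click Browse the filesystem to look at the files on HDFS

This page can only browse directories and files; it cannot delete, upload, or update them

localhost:50060 shows the TaskTracker information

Hadoop also embeds a Jetty server, which is why its status can be viewed through web pages

 Finished updating on July 22

Original post: https://www.cnblogs.com/lanhj/p/3841301.html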