Study Notes 1: ArachNode.Net 1.2

1. First, run the stored procedure dbo.arachnode_usp_arachnode.net_RESET_DATABASE, or execute the following from Program.cs in the Arachnode.Console project:

// Reset the database (same effect as running the stored procedure directly).
ArachnodeDAO arachnodeDAO = new ArachnodeDAO();
arachnodeDAO.ExecuteSql("EXEC [dbo].[arachnode_usp_arachnode.net_RESET_DATABASE]");

// Then queue a crawl request starting at taobao.com.
_crawler.Crawl(new CrawlRequest(new Discovery("http://taobao.com"), int.MaxValue, UriClassificationType.Domain | UriClassificationType.FileExtension, UriClassificationType.Domain | UriClassificationType.FileExtension, 1));

2. In SQL Server 2008, run the following statements against the cfg.Configuration table:

use [arachnode.net]
  update cfg.Configuration
  set Value = 'D:/LuceneDotNetIndex/Index'
  where [KEY] = 'LuceneDotNetIndexDirectory'
 
  update cfg.Configuration
  set Value = 'D:/LuceneDotNetIndex/DownloadedFiles'
  where [KEY] = 'DownloadedFilesDirectory'

  update cfg.Configuration
  set Value = 'D:/LuceneDotNetIndex/DownloadedImages'
  where [KEY] = 'DownloadedImagesDirectory'
 
  update cfg.Configuration
  set Value = 'D:/LuceneDotNetIndex/DownloadedWebPages'
  where [KEY] = 'DownloadedWebPagesDirectory'
 
  update cfg.Configuration
  set Value = 'D:/LuceneDotNetIndex/ConsoleOutputLogs'
  where [KEY] = 'ConsoleOutputLogsDirectory'
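
To confirm that the updates took effect, a query along these lines can be run afterwards. It is only a minimal sketch and assumes nothing beyond the table and the Key and Value columns already used in the statements above.

```sql
-- Sketch: list the five directory settings updated above.
USE [arachnode.net];

SELECT [Key], [Value]
FROM cfg.Configuration
WHERE [Key] IN (
    'LuceneDotNetIndexDirectory',
    'DownloadedFilesDirectory',
    'DownloadedImagesDirectory',
    'DownloadedWebPagesDirectory',
    'ConsoleOutputLogsDirectory'
);
```

Each returned row should show the matching D:/LuceneDotNetIndex/... path.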

3. In the database table cfg.CrawlActions, set the field that holds the action's settings string to:

AutoCommit=true|LuceneDotNetIndexDirectory=D:/LuceneDotNetIndex/Index|CheckIndexes=false|RebuildIndexOnLoad=false|WebPageIDLowerBound=1|WebPageIDUpperBound=100000
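
Before editing, it may help to look at the table and find the row whose settings mention LuceneDotNetIndexDirectory. The query below is a hedged sketch that assumes only the table name given in this step; it selects every column, so no column names are guessed.

```sql
-- Sketch: inspect all crawl actions to locate the settings string to edit.
USE [arachnode.net];

SELECT *
FROM cfg.CrawlActions;
```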

 

4. Configure the database connection:

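In Arachnode.Configuration, set:

 connectionString="Data Source=HENRYWEN-TUCU\SQLEXPRESS;Initial Catalog=arachnode.net;Integrated Security=True;Connection Timeout=3600;"

Alternatively, right-click the Function project and set it under Properties > Database > Connection String. Replace HENRYWEN-TUCU\SQLEXPRESS with your own server and instance name.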

5. In the development tool (VS2008), turn off 'Just My Code', which is a Visual Studio debugging option:

Tools > Options > Debugging > uncheck "Enable Just My Code"

Appendix: download tool (SVN client): TortoiseSVN

Original article: https://www.cnblogs.com/wenrenhua08/p/3993612.html