YaCy is a free search engine


http://yacy.net

Web Search by the people, for the people

YaCy is a free search engine that anyone can use to build a search portal for their intranet or to help search the public internet. When contributing to the world-wide peer network, the scale of YaCy is limited only by the number of users in the world, and it can index billions of web pages. It is fully decentralized: all users of the search engine network are equal, the network does not store user search requests, and it is not possible for anyone to censor the content of the shared index. We want to achieve freedom of information through a free, distributed web search that is powered by the world's users.

Decentralization

Imagine if, rather than relying on the proprietary software of a large professional search engine operator, your search engine was run by many private computers which aren't under the control of any one company or individual. Well, that's what YaCy does! The resulting decentralized web search currently has about 1.4 billion documents in its index (and growing - download and install YaCy to help out!) and more than 600 peer operators contribute each month. About 130,000 search queries are performed with this network each day.

Live image of the 'freeworld' network

There are already several search networks based on YaCy. The two major networks are the 'freeworld' network (the default public network that you join when you load the standard installation of YaCy) and the Sciencenet of the Karlsruhe Institute of Technology, which focuses on scientific content. Other YaCy networks exist as Tor hidden services, as local intranet services and on WiFi networks.

Installation is easy!

The installation takes only three minutes. Just download the release, decompress the package and run the start script. On Linux you need OpenJDK 6. You don't need to install external databases or a web server; everything is already included in YaCy.

 

Search Engine Technology

YaCy is a complete search appliance with user interface, index, administration and monitoring. The following diagram shows its components: 

YaCy harvests web pages with a web crawler. Documents are then parsed and indexed, and the search index is stored locally. If your peer is part of a peer network, your local search index is also merged into the shared index for that network. When a search is started, results from the local index are combined with results from the global search index held by peers in the YaCy search network.
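
The parse, index and search steps can be illustrated with a toy inverted index. This is only a minimal sketch of the general technique, not YaCy's actual code: the names `tokenize`, `LocalIndex` and `combined_search` are hypothetical, and each remote peer is modeled simply as another in-memory index rather than a network connection.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


class LocalIndex:
    """Toy inverted index (hypothetical): word -> set of URLs containing it."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add_document(self, url, text):
        """Parse a crawled document and record each word it contains."""
        for word in tokenize(text):
            self.postings[word].add(url)

    def search(self, query):
        """Return the URLs that contain every word of the query."""
        words = tokenize(query)
        if not words:
            return set()
        # Copy the first posting set so the index itself is never mutated.
        result = set(self.postings.get(words[0], set()))
        for word in words[1:]:
            result &= self.postings.get(word, set())
        return result


def combined_search(query, local, remote_peers):
    """Union of the local hits and the hits reported by each remote peer."""
    hits = local.search(query)
    for peer in remote_peers:
        hits |= peer.search(query)
    return hits
```

A real search engine would also rank the hits; this sketch only shows how local and remote result sets could be combined.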

 

Peer-to-Peer Networking

YaCy peers continuously exchange index fragments using a Distributed Hash Table. Index data can therefore reach the local peer even before a user query is submitted, but of course it is still loaded from the remote peer network too when needed. 
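
To make the idea of distributing index fragments concrete, here is a minimal sketch of a hash ring of the kind commonly used in Distributed Hash Tables. It is an assumption for illustration only: the text above does not describe YaCy's actual partitioning scheme, and `ring_position`, `ToyDHT` and `partition_index` are hypothetical names. Each word is hashed to a position on a ring, and the first peer at or after that position is treated as responsible for that word's index entries.

```python
import hashlib
from bisect import bisect_left


def ring_position(key):
    """Map a string to an integer position on the hash ring."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class ToyDHT:
    """Hypothetical hash ring that assigns words to peers."""

    def __init__(self, peer_names):
        if not peer_names:
            raise ValueError("at least one peer is required")
        # Peers sorted by their position on the ring.
        self.ring = sorted((ring_position(name), name) for name in peer_names)
        self.positions = [pos for pos, _ in self.ring]

    def responsible_peer(self, word):
        """Return the first peer at or after the word's position, wrapping around."""
        i = bisect_left(self.positions, ring_position(word))
        return self.ring[i % len(self.ring)][1]


def partition_index(postings, dht):
    """Group word -> URLs entries by the peer that should receive them."""
    fragments = {}
    for word, urls in postings.items():
        fragments.setdefault(dht.responsible_peer(word), {})[word] = urls
    return fragments
```

With such a scheme, a peer could push each fragment to its responsible peer ahead of time, so that a later query for a word only needs to contact the peer that holds it.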

Components

YaCy consists of a variety of components that serve the networking, administration and maintenance of the index with blacklists, moderation functions and community communication. The following graph shows components in YaCy: 

Original source: https://www.cnblogs.com/lexus/p/2209562.html