Building a Global DNS Record Search with Elasticsearch

 

Preface

The data is collected by Rapid7, which makes it available for download:
https://scans.io/study/sonar.fdns

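Download Elasticsearch 2.3

Elasticsearch is a search server built on Lucene, with distributed, multi-tenant capability. It is an open-source project written in Java (released under the Apache license) and is driven through a RESTful web interface. It offers near-real-time search, is stable, reliable, and fast, and is easy to install and use. It also scales out horizontally very well, without requiring a service restart.
Newer and older versions of Elasticsearch differ in small ways, and most Chinese-language documentation covers the older versions.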
https://www.elastic.co/downloads/past-releases/elasticsearch-2-3-0

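Install the head plugin

elasticsearch-head is a web front end for interacting with an Elasticsearch cluster visually.

Install a JDK first, then start Elasticsearch and install the plugin: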

bin/elasticsearch.bat
bin/plugin.bat install mobz/elasticsearch-head

https://github.com/mobz/elasticsearch-head

Create an index and define its mapping

PUT /test

{
    "settings": {
        "index": {
            "number_of_shards": "5",
            "number_of_replicas": "0"
        }
    },
    "mappings": {
        "my_type": {
            "properties": {
                "title": {
                    "type": "string",
                    "index": "not_analyzed"
                },
                "name" : {
                    "type" : "string"
                }
            }
        }
    }
}
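
The same request can also be issued from a script instead of the head plugin. The following is a minimal sketch using only the Python standard library. It assumes a local node at http://localhost:9200 (the address is an assumption, not something the article states), and the helper names build_create_index_request and send are hypothetical. The request is only built and printed here; calling send(request) would actually submit it.

```python
import json
import urllib.request

# Assumption: a local single-node Elasticsearch 2.3 on the default HTTP port.
ES_URL = "http://localhost:9200"

# Same settings and mapping as the PUT /test request above.
INDEX_BODY = {
    "settings": {"index": {"number_of_shards": "5", "number_of_replicas": "0"}},
    "mappings": {
        "my_type": {
            "properties": {
                "title": {"type": "string", "index": "not_analyzed"},
                "name": {"type": "string"},
            }
        }
    },
}


def build_create_index_request(index_name, body, base_url=ES_URL):
    """Build (but do not send) the PUT request that creates an index."""
    return urllib.request.Request(
        url=f"{base_url}/{index_name}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def send(request):
    """Send a prepared request and return the decoded JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


request = build_create_index_request("test", INDEX_BODY)
print(request.get_method(), request.full_url)
```

Keeping request construction separate from sending makes it easy to inspect the payload before anything touches the cluster.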

Test the mapping

GET /test/_analyze

{
  "field": "title",
  "text": "Blacdfdsfk-cats@qq.com"
}

Add a single document

POST /test/my_type/

{
    "title": "Blacdfdsfk-cats@qq.com",
    "name":  "Blacdfdsfk-cats@qq.com"
}

Simple search

GET /test/my_type/_search?q=name:cats

https://www.elastic.co/guide/en/elasticsearch/reference/2.3/search-uri-request.html

Structured search with a request body

GET /test/my_type/_search

{
    "query": {
        "prefix": {
            "name": "blacdfdsfk"
        }
    }
}

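Custom analyzers

An analyzer consists of three parts: character filters, a tokenizer, and token filters.

Because this is DNS data, the analyzer has to be tailored to it, for example by tokenizing in reverse and using "." as the delimiter.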

PUT /my_index
{
    "settings": {
        "analysis": {
            "analyzer": {
                "domain_name_analyzer": {
                    "filter":"lowercase",
                    "tokenizer": "domain_name_tokenizer",
                    "type": "custom"
                }
            },
            "tokenizer": {
                "domain_name_tokenizer": {
                    "type": "PathHierarchy",
                    "delimiter": ".",
                    "reverse": true
                }
            }
        }
    }
}

PUT /my_index/_mapping/site
{
    "properties": {
        "url": {
            "type":      "string",
            "analyzer":  "domain_name_analyzer"
        }
    }
}
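
To make the effect of this analyzer concrete, here is a small pure-Python sketch that approximates the tokens it should produce for a domain name. It does not call Elasticsearch; the function domain_name_tokens is a hypothetical stand-in for the reverse path-hierarchy tokenizer followed by the lowercase filter, and real tokenizer behaviour on unusual input may differ.

```python
def domain_name_tokens(domain, delimiter="."):
    """Approximate a reverse path-hierarchy tokenizer plus a lowercase filter.

    Each token is the input with zero or more leading labels removed,
    so every parent domain of the name becomes a searchable term.
    """
    labels = domain.lower().split(delimiter)
    return [
        delimiter.join(labels[i:])
        for i in range(len(labels))
        if labels[i]  # skip empty labels so no empty token is emitted
    ]


print(domain_name_tokens("WWW.Qidian.com"))
# ['www.qidian.com', 'qidian.com', 'com']
```

Since every suffix of the name is indexed as its own token, an exact-match lookup for a parent domain also finds its subdomains.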

Import data and test

PUT /dnsrecords
{
    "settings": {
        "index": {
            "number_of_shards": "5",
            "number_of_replicas": "0"
        },
        "analysis": {
            "analyzer": {
                "domain_name_analyzer": {
                    "filter":"lowercase",
                    "tokenizer": "domain_name_tokenizer",
                    "type": "custom"
                }
            },
            "tokenizer": {
                "domain_name_tokenizer": {
                    "type": "PathHierarchy",
                    "delimiter": ".",
                    "reverse": true
                }
            }
        }
    },
    "mappings": {
        "forward": {
            "properties": {
                "domain": {
                    "type": "string",
                    "analyzer":  "domain_name_analyzer"
                },
                "type" : {
                    "type" : "string",
                    "index": "not_analyzed"
                },
                "record" :{
                    "type": "string",
                    "index": "not_analyzed"
                }
            }
        }
    }
}
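
With the index in place, the records still have to be loaded. The sketch below shows one way to turn raw lines into a request body for the Elasticsearch bulk API (POST /_bulk), which expects newline-delimited JSON: an action line followed by a document line. It assumes each input line is a CSV row of the form domain,type,record; that layout is an assumption about the downloaded file and should be checked against the actual data. The helper names and the sample row (which uses a documentation-range IP address) are made up for illustration.

```python
import json


def bulk_lines(csv_lines, index="dnsrecords", doc_type="forward"):
    """Yield bulk-API lines (action, then document) for each valid CSV row."""
    for line in csv_lines:
        line = line.strip()
        if not line:
            continue
        # Split at most twice so a record value containing commas stays intact.
        parts = line.split(",", 2)
        if len(parts) != 3:
            continue  # skip rows that do not match the assumed layout
        domain, record_type, record = parts
        yield json.dumps({"index": {"_index": index, "_type": doc_type}})
        yield json.dumps({"domain": domain, "type": record_type, "record": record})


def bulk_body(csv_lines, **kwargs):
    """Join the bulk lines into one body; every line must end with a newline."""
    return "".join(line + "\n" for line in bulk_lines(csv_lines, **kwargs))


# Hypothetical sample input: one valid row, one blank line, one malformed row.
sample = ["www.qidian.com,a,192.0.2.10", "", "malformed line"]
body = bulk_body(sample)
print(body, end="")
```

For a dataset of this size, the body would normally be built and sent in batches of a few thousand rows rather than as a single request.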

Query

GET /dnsrecords/forward/_search HTTP/1.1

{
    "query": {
        "term": {
            "domain": "qidian.com"
        }
    }
}
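
A term query is not analyzed, so the value has to match an indexed token exactly. Because the domain analyzer lowercases its tokens, a mixed-case query value would find nothing. The following sketch builds the request body with the value normalized first. The function name subdomain_query and the size default are illustrative choices, not part of the original setup.

```python
import json


def subdomain_query(domain, size=10):
    """Build a term-query body matching the domain and, via the indexed
    suffix tokens, any name that ends with it."""
    normalized = domain.strip().lower()
    return {"size": size, "query": {"term": {"domain": normalized}}}


print(json.dumps(subdomain_query(" Qidian.COM "), indent=4))
```

The resulting JSON can be sent as the body of GET /dnsrecords/forward/_search.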

References
https://github.com/Pynow/elasticsearch
http://wiki.jikexueyuan.com/project/elasticsearch-definitive-guide-cn/

Original article: https://www.cnblogs.com/icez/p/6874731.html