Nginx Anti-Crawler Configuration

In the nginx conf directory, create a file named agent_deny.conf (for example with vi) and add the following content:

# Block scraping tools such as Scrapy, curl, and HttpClient
if ($http_user_agent ~* (Scrapy|Curl|HttpClient)) {
    return 403;
}
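To verify this rule, send a request with a matching User-Agent; example.com below is a placeholder for your own server:

# HEAD request with a blocked User-Agent; expect HTTP/1.1 403 Forbidden
curl -I -A "Scrapy/2.5" http://example.com/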

# Block the listed User-Agents, as well as requests with an empty User-Agent
if ($http_user_agent ~ "FeedDemon|JikeSpider|Indy Library|Alexa Toolbar|AskTbFXTV|AhrefsBot|CrawlDaddy|CoolpadWebkit|Java|Feedly|UniversalFeedParser|ApacheBench|Microsoft URL Control|Swiftbot|ZmEu|oBot|jaunty|Python-urllib|lightDeckReports Bot|YYSpider|DigExt|YisouSpider|HttpClient|MJ12bot|heritrix|EasouSpider|Ezooms|^$") {
    return 403;
}
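Note that this rule uses the case-sensitive match operator ~ (unlike ~* above), so the listed names must appear exactly as written. A quick check of the empty-UA case, again with example.com as a placeholder:

# Passing an empty string to -A makes curl omit the User-Agent header,
# so $http_user_agent is empty and matches ^$; expect 403 Forbidden
curl -I -A "" http://example.com/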

# Block request methods other than GET, HEAD, and POST
if ($request_method !~ ^(GET|HEAD|POST)$) {
    return 403;
}
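And a quick check with a method outside that whitelist (example.com is a placeholder):

# DELETE is not in GET|HEAD|POST, so nginx should answer 403
curl -i -X DELETE http://example.com/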

Finally, add include agent_deny.conf; inside the location block under server.
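For orientation, a minimal sketch of where the include line sits; the server_name and root values are placeholders:

server {
    listen 80;
    server_name example.com;          # placeholder
    location / {
        include agent_deny.conf;      # the anti-crawler rules from above
        root /usr/share/nginx/html;   # placeholder document root
        index index.html;
    }
}

After editing, check the configuration with nginx -t and reload it with nginx -s reload so the rules take effect.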
