Python: extracting the top-level domain (TLD) from a given URL

Installation

Latest stable version from PyPI:

pip install tld

Or the latest stable version from GitHub:

pip install https://github.com/barseghyanartur/tld/archive/stable.tar.gz

Or the latest stable version from BitBucket:

pip install https://bitbucket.org/barseghyanartur/tld/get/stable.tar.gz

Usage examples

Get the TLD name as a string from a given URL

from tld import get_tld

get_tld("http://www.google.co.uk")
# 'co.uk'

# With fail_silently=True, an unknown TLD returns None instead of raising.
get_tld("http://www.google.idontexist", fail_silently=True)
# None

Get the TLD as an object

from tld import get_tld

res = get_tld("http://some.subdomain.google.co.uk", as_object=True)

res
# 'co.uk'

res.subdomain
# 'some.subdomain'

res.domain
# 'google'

res.tld
# 'co.uk'

res.fld
# 'google.co.uk'

res.parsed_url
# SplitResult(
#     scheme='http',
#     netloc='some.subdomain.google.co.uk',
#     path='',
#     query='',
#     fragment=''
# )
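
The parsed_url value shown above has the shape of a SplitResult from Python's standard urllib.parse module. As a small sketch that uses only the standard library (it does not call tld itself), the same structure can be produced with urlsplit, which may help when reading the individual fields:

```python
from urllib.parse import urlsplit

# Split the same URL used in the example above with the standard library.
parts = urlsplit("http://some.subdomain.google.co.uk")

print(parts.scheme)    # http
print(parts.netloc)    # some.subdomain.google.co.uk
print(repr(parts.path))  # ''
```

Note that urlsplit only separates the URL into components; deciding which trailing labels form the TLD is the part the tld package adds on top.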

Get the TLD name, ignoring a missing protocol

from tld import get_tld, get_fld

# fix_protocol=True lets the URL be parsed even though it has no scheme.
get_tld("www.google.co.uk", fix_protocol=True)
# 'co.uk'

get_fld("www.google.co.uk", fix_protocol=True)
# 'google.co.uk'
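
To see why a missing protocol matters, it can help to look at how the standard library splits such a URL: without a scheme, the host name ends up in the path rather than in the network location. The sketch below illustrates this with urllib.parse only. The helper ensure_scheme and its default_scheme parameter are hypothetical names introduced for illustration, and this is not necessarily how tld implements fix_protocol.

```python
from urllib.parse import urlsplit


def ensure_scheme(url: str, default_scheme: str = "https") -> str:
    """Hypothetical helper: prepend a scheme when the URL has none."""
    # Leave the URL alone if it already has a scheme or is scheme-relative.
    if "://" in url or url.startswith("//"):
        return url
    return f"{default_scheme}://{url}"


bare = urlsplit("www.google.co.uk")
fixed = urlsplit(ensure_scheme("www.google.co.uk"))

print(repr(bare.netloc))  # ''
print(bare.path)          # www.google.co.uk
print(fixed.netloc)       # www.google.co.uk
```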

Return the TLD parts as a tuple

from tld import parse_tld

# The tuple is ordered: TLD, domain, subdomain.
parse_tld('http://www.google.com')
# ('com', 'google', 'www')
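
In the output above the parts come back in the order TLD, domain, subdomain. As a rough sketch of how such a tuple might be consumed, the snippet below unpacks it and rebuilds the host name. The tuple is copied from the example output rather than produced by calling the library, and join_parts is a hypothetical helper, not part of tld.

```python
# The tuple below is copied from the example output above.
tld_part, domain, subdomain = ("com", "google", "www")


def join_parts(tld_part: str, domain: str, subdomain: str) -> str:
    """Hypothetical helper: rebuild a host name, skipping empty parts."""
    labels = [subdomain, domain, tld_part]
    return ".".join(label for label in labels if label)


first_level_domain = f"{domain}.{tld_part}"
print(first_level_domain)                       # google.com
print(join_parts(tld_part, domain, subdomain))  # www.google.com
print(join_parts("co.uk", "google", ""))        # google.co.uk
```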

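Get the first-level domain name as a string from a given URL

from tld import get_fld

get_fld("http://www.google.co.uk")
# 'google.co.uk'

# With fail_silently=True, an unknown TLD returns None instead of raising.
get_fld("http://www.google.idontexist", fail_silently=True)
# None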

Original article: https://www.cnblogs.com/ltn26/p/10007860.html