Python Exercises - Day 18

1. Match the first letter of every word in a line of text

import re

s="i love you not because of who you are, but because of who i am when i am with you"

# \b anchors the match at a word boundary, so \w matches only the first character of each word
content=re.findall(r"\b\w",s)
print(content)
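
The word boundary `\b` is what limits a match to the start of a word. The minimal sketch below compares a pattern with and without it; the short sample string and the variable names are made up for illustration.

```python
import re

sample = "who i am"  # hypothetical sample text

# With \b, only the first character of each word matches
first_letters = re.findall(r"\b\w", sample)

# Without \b, every word character matches
all_chars = re.findall(r"\w", sample)

print(first_letters)
print(all_chars)
```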

2. Match the leading digits of every word in a line of text

import re

s="i love you not because 12sd 34er 56df e4 54434"

# \b\d+ matches a run of digits only when it starts a word, so the 4 in "e4" is skipped
ret=re.findall(r"\b\d+",s)
print(ret)
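
If the letters that follow the digits are also of interest, capture groups can return both parts at once. This is only a sketch: the second group and the name `pairs` are additions that the exercise does not ask for.

```python
import re

s = "i love you not because 12sd 34er 56df e4 54434"

# Group 1 holds the leading digits, group 2 the letters that follow (possibly empty)
pairs = re.findall(r"\b(\d+)([a-z]*)", s)
print(pairs)
```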

3. Match the leading digits or letters in a line of text, for example 123sdf

s="123sdf"
import re
# ^\w+ matches the run of letters and digits at the start of the string
content=re.search(r"^\w+",s).group()
print(content)
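
A related point is the difference between `re.match`, which only tries the start of the string, and `re.search`, which scans forward. The sketch below illustrates it on the same string; the variable names are illustrative.

```python
import re

s = "123sdf"

# re.match succeeds only if the pattern matches at position 0
leading_digits = re.match(r"\d+", s)
leading_letters = re.match(r"[a-z]+", s)  # None: the string starts with digits

# re.search finds the first match anywhere in the string
first_letters = re.search(r"[a-z]+", s)

print(leading_digits.group(), leading_letters, first_letters.group())
```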

4. Match only the lines that contain both letters and digits

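import re
s="i love you not because 12sd 34er 56 df e4 54434"

# The lookaheads require a letter and a digit on the line; re.M applies ^ and $ to each line
content=re.findall(r"^(?=.*[a-zA-Z])(?=.*\d).*$",s,re.M)
print(content)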
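
Because the sample string has only one line, line-level filtering is easier to see on multi-line input. The sketch below uses a made-up four-line text and two lookaheads, one requiring a letter and one requiring a digit; this is one possible reading of the exercise, not the only one.

```python
import re

# Hypothetical multi-line sample: only some lines contain both letters and digits
text = "abc 123\nonly letters\n456 789\nx9"

# (?=.*[A-Za-z]) requires a letter somewhere on the line,
# (?=.*\d) requires a digit; re.M makes ^ and $ match at each line
pattern = re.compile(r"^(?=.*[A-Za-z])(?=.*\d).*$", re.M)
matching_lines = pattern.findall(text)
print(matching_lines)
```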

5. Write a regular expression that recognizes all of the following strings: 'bat', 'bit', 'but', 'hat', 'hit', 'hut'

import re

s="'bat', 'bit', 'but', 'hat', 'hit', 'hut'"

# Method 1: match every run of word characters
content=re.findall(r"\w+",s)
print(content)
# Method 2: match only b or h, then a, i or u, then t
content=re.findall(r"[bh][aiu]t",s)
print(content)
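
To check that a pattern accepts exactly the six words and nothing else, each candidate can be tested with `fullmatch`. This is a sketch; the character-class pattern and the negative examples in `others` are assumptions added for illustration.

```python
import re

pattern = re.compile(r"[bh][aiu]t")
words = ["bat", "bit", "but", "hat", "hit", "hut"]
others = ["bet", "cat", "hits"]  # hypothetical strings that should be rejected

# fullmatch requires the whole string to match, not just a part of it
accepted = [w for w in words if pattern.fullmatch(w)]
rejected = [w for w in others if not pattern.fullmatch(w)]
print(accepted, rejected)
```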

6. Match all valid Python identifiers

#coding=utf-8

import re

s="awoeur awier !@# @#4_-asdf3$^&()+?><dfg$ $"

# An identifier starts with a letter or underscore, followed by letters, digits or underscores
content=re.findall(r"\b[a-zA-Z_]\w*",s)
print(content)
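
A regex finds tokens that look like identifiers, but reserved words such as `for` also fit that shape. The sketch below cross-checks the candidates with `str.isidentifier` and `keyword.iskeyword` from the standard library; the ASCII-only pattern is an assumption, since Python also accepts many non-ASCII letters in identifiers.

```python
import keyword
import re

s = "awoeur awier !@# @#4_-asdf3$^&()+?><dfg$ $"

# ASCII-only pattern: a letter or underscore, then any word characters
candidates = re.findall(r"\b[A-Za-z_]\w*", s)

# Keep only names Python itself accepts and that are not reserved words
identifiers = [c for c in candidates
               if c.isidentifier() and not keyword.iskeyword(c)]
print(identifiers)
```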

7. Extract the complete date and time field from each line

#coding=utf-8

import re

s="""se234 1987-02-09 07:30:00

1987-02-10 07:25:00"""

content=re.findall(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}",s)
print(content)
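
The pattern only checks the shape of the field, so a value such as month 13 would still match. One possible follow-up is to pass each match to `datetime.strptime`, which raises `ValueError` for impossible dates. The names `stamps` and `parsed` are illustrative.

```python
import re
from datetime import datetime

s = """se234 1987-02-09 07:30:00

1987-02-10 07:25:00"""

# Find the text of each timestamp first
stamps = re.findall(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}", s)

# Then convert each one to a datetime object, which also validates it
parsed = [datetime.strptime(t, "%Y-%m-%d %H:%M:%S") for t in stamps]
print(parsed)
```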

8. Replace the email addresses on each line with your own email address

#coding=utf-8

import re

s="""693152032@qq.com, werksdf@163.com, sdf@sina.com

sfjsdf@139.com, soifsdfj@134.com

pwoeir423@123.com"""

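# The dot before com is escaped so that it matches a literal "."
content=re.subn(r"\w+@\w+\.com","test@qq.com",s)
# subn returns a tuple: (new string, number of replacements)
print(content,type(content))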
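
`re.subn` returns a tuple, which is why the printed type is not `str`. The sketch below unpacks that tuple and compares it with `re.sub`; the shortened sample string is made up for illustration.

```python
import re

s = "a@qq.com, b@163.com"  # hypothetical shortened sample
pattern = r"\w+@\w+\.com"

# subn returns (new text, number of replacements)
new_text, count = re.subn(pattern, "test@qq.com", s)

# sub returns only the new text
only_text = re.sub(pattern, "test@qq.com", s)
print(new_text, count)
```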

9. Match the keyword home, s="skjdfoijower home homewer"

s="skjdfoijower home   homewer"
import re
content=re.findall(r"\bhome\b",s)
print(content)
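
The sketch below shows why word boundaries matter here: a bare `home` also matches the start of `homewer`, while `\bhome\b` matches only the standalone word. The variable names are illustrative.

```python
import re

s = "skjdfoijower home   homewer"

# Without boundaries, the "home" inside "homewer" is matched too
loose = re.findall(r"home", s)

# With \b on both sides, only the standalone word matches
exact = re.findall(r"\bhome\b", s)
print(loose, exact)
```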

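10. Use a regular expression to extract the words from the following string

import re
s="""i love you not because of who 234 you are, 234 but 3234ser because of who i am when i am with you"""
# \b on both sides keeps pure-letter words and skips mixed tokens such as 3234ser
content=re.findall(r"\b[a-z]+\b",s,re.I)
print(content)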

11. Use a regular expression to match valid email addresses:

import re

s="""xiasd@163.com, sdlfkj@.com sdflkj@180.com solodfdsf@123.com sdlfjxiaori@139.com saldkfj.com oisdfo@.sodf.com.com"""

# The dot before com is escaped so that it matches a literal "."
content=re.findall(r"\w+@\w+\.com",s)
print(content)
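
A pattern ending in `.com` is enough for this sample but would miss other domains. The sketch below splits the text into tokens and tests each one with a slightly more general pattern. The rule it encodes is a simplification assumed for illustration, not the full email address standard.

```python
import re

s = """xiasd@163.com, sdlfkj@.com sdflkj@180.com solodfdsf@123.com sdlfjxiaori@139.com saldkfj.com oisdfo@.sodf.com.com"""

# Simplified rule (an assumption): local part, "@", one or more
# dot-separated labels, and a final alphabetic label of 2+ letters
email = re.compile(r"[\w.+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)*\.[A-Za-z]{2,}")

# Split on commas and whitespace, then require each token to match in full
tokens = re.split(r"[,\s]+", s)
valid = [t for t in tokens if email.fullmatch(t)]
print(valid)
```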

12. Remove the tags from the following HTML file and show only the text.

<div>

Responsibilities:

Server-side work including recommendation algorithms, data statistics, APIs, and back-office services

 

Requirements:

Strong self-motivation and professionalism; proactive and results-oriented

 

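Technical requirements:

1. At least one year of Python development experience; solid grasp of object-oriented analysis and design; familiarity with design patterns

2. Solid grasp of the HTTP protocol; familiarity with MVC, MVVM and related web development frameworks

3. Solid grasp of relational database design and development and of SQL; proficiency in either MySQL or PostgreSQL

4. Solid grasp of NoSQL and MQ; proficiency with the corresponding technical solutions

5. Familiarity with Javascript/CSS/HTML5, JQuery, React, Vue.js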

 

Bonus points:

Big data, mathematical statistics, machine learning, sklearn, high performance, high concurrency.

</div>

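import re
# s is assumed to hold the HTML text above
# Method 1: remove simple tags such as <div> and </div>
content=re.sub(r"</?\w+>"," ",s)
print(content)

# Method 2: remove any tag, including tags that carry attributes
content=re.sub(r"</?[^>]+>"," ",s)
print(content)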
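
The two patterns differ once a tag carries attributes. The sketch below runs both on a short made-up HTML string (not the job posting above) and then collapses the leftover whitespace; the names `simple`, `general` and `text` are illustrative.

```python
import re

# Hypothetical short HTML sample with an attribute on the outer tag
s = '<div class="job"><p>Responsibilities:</p><p>Backend work</p></div>'

# Pattern 1 only matches tags made of a bare name, so <div class="job"> survives
simple = re.sub(r"</?\w+>", " ", s)

# Pattern 2 matches anything between < and >, so every tag is removed
general = re.sub(r"</?[^>]+>", " ", s)

# Collapse the runs of spaces left behind by the removed tags
text = " ".join(general.split())
print(text)
```

For real-world HTML, a parser such as the standard library's `html.parser` is generally more robust than a regex.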

13. Extract the domain name from each of the following URLs:

http://www.interoem.com/messageinfo.asp?id=35
http://3995503.com/class/class09/news_show.asp?id=14
http://lib.wzmc.edu.cn/news/onews.asp?id=769
http://www.zy-ls.com/alfx.asp?newsid=377&id=6
http://www.fincm.com/newslist.asp?id=415

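import re
# s2 is assumed to hold the URLs above, one per line
# Capture the scheme and host of each URL, then drop the rest of the line
p = r"(http://.+?/).+"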
 
print(re.sub(p, lambda x : x.group(1), s2))
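
The snippet above leaves the `http://` prefix and a trailing slash in the result. As an alternative sketch, the standard library's `urllib.parse.urlparse` returns the host on its own; this swaps the regex for a URL parser. The variable `urls` stands in for the undefined `s2`, and `with_scheme` and `domains` are illustrative names.

```python
import re
from urllib.parse import urlparse

urls = """http://www.interoem.com/messageinfo.asp?id=35
http://3995503.com/class/class09/news_show.asp?id=14
http://lib.wzmc.edu.cn/news/onews.asp?id=769
http://www.zy-ls.com/alfx.asp?newsid=377&id=6
http://www.fincm.com/newslist.asp?id=415"""

# Regex version: keeps the scheme and the trailing slash
p = r"(http://.+?/).+"
with_scheme = re.sub(p, lambda x: x.group(1), urls)

# Parser version: netloc is the host part only
domains = [urlparse(line.strip()).netloc
           for line in urls.splitlines() if line.strip()]
print(domains)
```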