JavaScript Regular Expression Matching

I. Regex testing sites

1.https://regexper.com/

2.http://www.regexr.com/

Character classes
.	any character except newline
\w \d \s	word, digit, whitespace
\W \D \S	not word, digit, whitespace
[abc]	any of a, b, or c
[^abc]	not a, b, or c
[a-g]	character between a & g
Anchors
^abc$	start / end of the string
\b \B	word, not-word boundary
Escaped characters
\. \* \\	escaped special characters
\t \n \r	tab, linefeed, carriage return
\u00A9	unicode escaped ©
Groups & Lookaround
(abc)	capture group
\1	backreference to group #1
(?:abc)	non-capturing group
(?=abc)	positive lookahead
(?!abc)	negative lookahead
Quantifiers & Alternation
a* a+ a?	0 or more, 1 or more, 0 or 1
a{5} a{2,}	exactly five, two or more
a{1,3}	between one & three
a+? a{2,}?	match as few as possible
ab|cd	match ab or cd
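
A few of the constructs listed above, tried out in JavaScript. This is only a minimal sketch; the sample strings and variable names are made up for illustration.

```javascript
// Backreference: \1 must repeat whatever group #1 captured.
const doubledWord = /\b(\w+) \1\b/;
const hasDoubledWord = doubledWord.test("this is is a test");

// Positive lookahead: match digits only when they are followed by "px",
// without making "px" part of the match.
const pxValue = "width: 120px".match(/\d+(?=px)/)[0];

// Greedy vs. lazy quantifier on the same input.
const html = "<b>one</b><b>two</b>";
const greedy = html.match(/<b>.*<\/b>/)[0];  // runs to the last </b>
const lazy = html.match(/<b>.*?<\/b>/)[0];   // stops at the first </b>

// Unicode escape: \u00A9 is the copyright sign.
const hasCopyright = /\u00A9/.test("© 2016");

console.log(hasDoubledWord, pxValue, greedy, lazy, hasCopyright);
```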

II. Common regular expressions

1. Strip HTML tags from a string

var str = "<span style='display:none;'>This is test</span><img src=''>ss</img><strong></strong><br/>";
str = str.replace(/<[^>]+>/g, ""); // remove all HTML tags; replace() returns a new string, so assign it back

Strip all HTML tags from a web page (C#)

string temp = Regex.Replace(html, "<[^>]*>", ""); // html is the document to strip HTML tags from

2. Extract link addresses from a web page

string matchString = @"<a[^>]+href=\s*(?:'(?<href>[^']+)'|""(?<href>[^""]+)""|(?<href>[^>\s]+))\s*[^>]*>";
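
The pattern above is written for C#, where the same group name (`href`) may appear in several alternatives. A rough JavaScript counterpart is sketched below; it uses three numbered groups instead, because duplicate named groups are only accepted by recent JavaScript engines. The helper name `extractHrefs` and the sample markup are hypothetical.

```javascript
// Hypothetical helper: collect href values from <a> tags in an HTML string.
// Handles single-quoted, double-quoted, and unquoted attribute values.
function extractHrefs(html) {
  const linkPattern =
    /<a\s[^>]*?href\s*=\s*(?:'([^']*)'|"([^"]*)"|([^>\s]+))[^>]*>/gi;
  const hrefs = [];
  for (const match of html.matchAll(linkPattern)) {
    // Only one of the three groups participates in each match.
    hrefs.push(match[1] ?? match[2] ?? match[3]);
  }
  return hrefs;
}

const sample =
  '<a href="https://regexper.com/">A</a> ' +
  "<a class='x' href='http://www.regexr.com/'>B</a> " +
  "<a href=/local/page>C</a>";
const links = extractHrefs(sample);
console.log(links);
```

A regex like this is fine for quick scraping of known markup; for arbitrary pages an HTML parser is the safer choice.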

3. Strip CSS styles

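var $content = str.replace(/<!--[^>]*-->/gi, ""); // remove HTML comments
$content = $content.replace(/style=.+?['"]/gi, ""); // remove style attributes
$content = $content.replace(/class=.+?['"]/gi, ""); // remove class attributes
$content = $content.replace(/id=.+?['"]/gi, ""); // remove id attributes
$content = $content.replace(/lang=.+?['"]/gi, ""); // remove lang attributes
$content = $content.replace(/width=.+?['"]/gi, ""); // remove width attributes
$content = $content.replace(/height=.+?['"]/gi, ""); // remove height attributes
$content = $content.replace(/border=.+?['"]/gi, ""); // remove border attributes
$content = $content.replace(/face=.+?['"]/gi, ""); // remove face attributes
$content = $content.replace(/face=.+?['"]/g, ""); // same pattern without the i flag: matches lowercase "face" only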
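
The separate replacements above can also be folded into a single pass. The following is a minimal sketch, assuming every attribute value is wrapped in single or double quotes; the helper name `stripAttributes` and the `ATTRIBUTES` list are illustrative, not part of any library.

```javascript
// Hypothetical helper: remove presentational attributes from HTML in one pass.
// Assumes attribute values are wrapped in single or double quotes.
const ATTRIBUTES = ["style", "class", "id", "lang", "width", "height", "border", "face"];

function stripAttributes(html) {
  // The leading \s+ ties the match to a whole attribute name (so "id" does
  // not match inside "data-valid") and also removes the separating whitespace.
  const attrPattern = new RegExp(
    "\\s+(?:" + ATTRIBUTES.join("|") + ")\\s*=\\s*(?:\"[^\"]*\"|'[^']*')",
    "gi"
  );
  return html
    .replace(/<!--[\s\S]*?-->/g, "") // HTML comments, including multi-line ones
    .replace(attrPattern, "");
}

const cleaned = stripAttributes(
  '<p class="note" STYLE=\'color:red\' data-valid="1"><!-- hidden -->text</p>'
);
console.log(cleaned);
```

Note that the pattern works on raw text, so it would also remove something that merely looks like an attribute inside the page's visible text.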

4. Email regular expressions

The goal is to match addresses such as zhangshan@163.com, abc@sina.com.cn, zhangshna.Mr@163.com, abc_Wang.dd@sian.com, and abc_Wang.dd.cc@sian.com. Note that some of them contain dots before the @ sign.

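/^[a-zA-Z0-9_-]+@[a-zA-Z0-9_-]+(\.[a-zA-Z0-9_-]+)+$/; // original regular expression (no dots allowed before the @)

/^(\w)+(\.\w+)*@(\w)+((\.\w{2,3}){1,3})$/; // revised: dots allowed before the @; one to three domain suffixes of 2-3 characters each

/^(\w)+(\.\w+)*@(\w)+((\.\w+)+)$/; // revised: dots allowed before the @; domain suffixes of any length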
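
To see how the three patterns differ, they can be run against the sample addresses. This is a small sketch with the `\.` and `\w` escapes written out; the property names and the `user@example.info` address are made up for the comparison.

```javascript
const patterns = {
  original: /^[a-zA-Z0-9_-]+@[a-zA-Z0-9_-]+(\.[a-zA-Z0-9_-]+)+$/,
  revisedShortSuffix: /^(\w)+(\.\w+)*@(\w)+((\.\w{2,3}){1,3})$/,
  revisedAnySuffix: /^(\w)+(\.\w+)*@(\w)+((\.\w+)+)$/,
};

const samples = [
  "zhangshan@163.com",
  "abc@sina.com.cn",
  "zhangshna.Mr@163.com",
  "abc_Wang.dd.cc@sian.com",
  "user@example.info", // hypothetical address with a four-letter suffix
];

// Build a table: one row per address, one boolean column per pattern.
const results = {};
for (const email of samples) {
  results[email] = {};
  for (const [name, pattern] of Object.entries(patterns)) {
    results[email][name] = pattern.test(email);
  }
}
console.table(results);
```

The original pattern rejects any address with a dot before the @, and the `{2,3}` variant rejects suffixes longer than three characters. None of the three is a complete validator of email syntax; they are pragmatic checks for common address shapes.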

Character descriptions:

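- ^ : matches the start of the input.
- \ : marks the next character as a special character or as a literal.
- * : matches the preceding character zero or more times.
- + : matches the preceding character one or more times.
- (pattern) : matches the pattern and remembers the match.
- x|y : matches either x or y.
- [a-z] : a character range; matches any character within the specified range.
- \w : matches any word character, including the underscore.
- {n,m} : matches at least n and at most m times.
- $ : matches the end of the input.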
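
A short sketch of three of the items above in use: remembered groups, alternation, and bounded repetition. The patterns and inputs are illustrative only.

```javascript
// (pattern): the parenthesized parts are remembered and returned by match().
const parts = "abc@sina.com.cn".match(/^(\w+)@(\w+)\.(.+)$/);
const user = parts[1];
const host = parts[2];
const suffix = parts[3];

// x|y: alternation between several file extensions.
const isImage = /\.(png|jpe?g|gif)$/i.test("photo.JPG");

// {n,m}: between n and m repetitions of the preceding item.
const fitsRange = /^\d{4,6}$/.test("12345");
const tooLong = /^\d{4,6}$/.test("1234567");

console.log(user, host, suffix, isImage, fitsRange, tooLong);
```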

References: Removing HTML tags, CSS, and more in PHP https://gist.github.com/wdd2007/3713543

Email regular expressions http://www.cnblogs.com/vs-bug/archive/2010/03/26/1696752.html