How to strip unwanted HTML from a string in C#

using System.Text.RegularExpressions;

/// <summary>
/// Strips HTML tags from a string.
/// </summary>
/// <param name="stringToStrip">The HTML content to clean.</param>
/// <returns>The text with all tags removed.</returns>
public static string StripHTML(string stringToStrip)
{
    // Turn a paragraph boundary (</p> followed by <p>) into a blank line.
    stringToStrip = Regex.Replace(stringToStrip, "</p(?:\\s*)>(?:\\s*)<p(?:\\s*)>", "\n\n", RegexOptions.IgnoreCase | RegexOptions.Compiled);
    // Turn <br>, <br/> and <br /> into a newline (the slash is optional).
    stringToStrip = Regex.Replace(stringToStrip, "<br(?:\\s*)/?>", "\n", RegexOptions.IgnoreCase | RegexOptions.Compiled);
    // Replace each double quote with two single quotes.
    stringToStrip = Regex.Replace(stringToStrip, "\"", "''", RegexOptions.IgnoreCase | RegexOptions.Compiled);
    // Remove whatever tags are left.
    stringToStrip = StripHtmlXmlTags(stringToStrip);
    return stringToStrip;
}

private static string StripHtmlXmlTags(string content)
{
    // Matches anything of the form <...> and deletes it.
    return Regex.Replace(content, "<[^>]+>", "", RegexOptions.IgnoreCase | RegexOptions.Compiled);
}

Very handy, haha. It strips out every HTML tag and leaves just the text.
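To see what the method actually returns, here is a small console sketch. The class name `HtmlStripDemo` and the sample input are made up for illustration and are not part of the original snippet; the regexes follow the same steps as the code above, with the `<br>` pattern also accepting a tag without the slash.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical wrapper class, used only for this demo.
public static class HtmlStripDemo
{
    public static string StripHTML(string input)
    {
        // A paragraph boundary becomes a blank line.
        input = Regex.Replace(input, @"</p\s*>\s*<p\s*>", "\n\n", RegexOptions.IgnoreCase);
        // <br>, <br/> and <br /> become a newline.
        input = Regex.Replace(input, @"<br\s*/?>", "\n", RegexOptions.IgnoreCase);
        // Each double quote becomes two single quotes, as in the snippet above.
        input = input.Replace("\"", "''");
        // Every remaining <...> tag is deleted.
        return Regex.Replace(input, "<[^>]+>", "");
    }

    public static void Main()
    {
        // Sample input, assumed for illustration.
        string html = "<p>Hello <b>world</b></p><p>Line one<br/>Line two</p>";
        Console.WriteLine(StripHTML(html));
    }
}
```

Keep in mind that only the tags themselves are removed: text between `<script>` or `<style>` tags stays in the result, and entities such as `&amp;` are not decoded.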

Original post: https://www.cnblogs.com/wsl2011/p/2055712.html