Practical Java regular expression examples, updated continuously. I'm writing them down so I don't have to rewrite them later.

1. Stripping HTML tags:

/**
 * Strips the HTML tags wrapped around the text
 * @author CY
 *
 */
public class TrimHTML {
    public static void main(String[] args) {
        String d3 = "<div id='mylinks'><a id='blog_nav_sitehome' class='menu' href='http://www.cnblogs.com/'>博客园</a> &nbsp;<a id='blog_nav_myhome' class='menu' href='http://www.cnblogs.com/tenWood/'>首页</a> &nbsp;<a id='blog_nav_newpost' class='menu' rel='nofollow' href='https://i.cnblogs.com/EditPosts.aspx?opt=1'>新随笔</a> &nbsp;<a id='blog_nav_contact' class='menu' rel='nofollow' href='https://msg.cnblogs.com/send/%E6%9C%89%E7%82%B9%E6%87%92%E6%83%B0%E7%9A%84%E5%B0%8F%E9%9D%92%E5%B9%B4'>联系</a> &nbsp;<a id='blog_nav_rss' class='menu' href='http://www.cnblogs.com/tenWood/rss'>订阅</a><a id='blog_nav_rss_image' href='http://www.cnblogs.com/tenWood/rss'><img src='//www.cnblogs.com/images/xml.gif' alt='订阅'></a>&nbsp;<a id='blog_nav_admin' class='menu' rel='nofollow' href='https://i.cnblogs.com/'>管理</a></div>";
        // Remove every <...> run that contains no nested angle brackets
        String result = d3.replaceAll("<[^<>]+>", "");
        System.out.println(result);
    }
}

Output:

博客园 &nbsp;首页 &nbsp;新随笔 &nbsp;联系 &nbsp;订阅&nbsp;管理
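The output above still contains the literal `&nbsp;` entities, because the regex only removes tags. A minimal sketch of one way to handle that, assuming you just want each `&nbsp;` turned into a plain space (the class name `StripTagsAndNbsp` and the method `strip` are made up for illustration; other entities such as `&amp;` are not handled here):

```java
class StripTagsAndNbsp {
    // Hypothetical helper: removes tags, then replaces each &nbsp; entity with a space.
    static String strip(String html) {
        // Same expression as in the example above
        String noTags = html.replaceAll("<[^<>]+>", "");
        // String.replace does a literal (non-regex) replacement
        return noTags.replace("&nbsp;", " ");
    }

    public static void main(String[] args) {
        String html = "<a href='#'>博客园</a> &nbsp;<a href='#'>首页</a>";
        System.out.println(strip(html));
    }
}
```

Because the original text already has a regular space before each `&nbsp;`, the result contains two consecutive spaces at those positions. Collapse them afterwards if that matters.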

Method 2 (reference blog post: http://www.cnblogs.com/devinzhang/archive/2012/05/09/2491619.html):

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TrimHTML2 {
    public static void main(String[] args) {
        String d3 = "<div id='mylinks'><a id='blog_nav_sitehome' class='menu' href='http://www.cnblogs.com/'>博客园</a> &nbsp;<a id='blog_nav_myhome' class='menu' href='http://www.cnblogs.com/tenWood/'>首页</a> &nbsp;<a id='blog_nav_newpost' class='menu' rel='nofollow' href='https://i.cnblogs.com/EditPosts.aspx?opt=1'>新随笔</a> &nbsp;<a id='blog_nav_contact' class='menu' rel='nofollow' href='https://msg.cnblogs.com/send/%E6%9C%89%E7%82%B9%E6%87%92%E6%83%B0%E7%9A%84%E5%B0%8F%E9%9D%92%E5%B9%B4'>联系</a> &nbsp;<a id='blog_nav_rss' class='menu' href='http://www.cnblogs.com/tenWood/rss'>订阅</a><a id='blog_nav_rss_image' href='http://www.cnblogs.com/tenWood/rss'><img src='//www.cnblogs.com/images/xml.gif' alt='订阅'></a>&nbsp;<a id='blog_nav_admin' class='menu' rel='nofollow' href='https://i.cnblogs.com/'>管理</a></div>我是尾巴";
        
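        // Match anything that looks like a tag: '<', any run of characters other than '>', then '>'
        Pattern p = Pattern.compile("<([^>]*)>", Pattern.CASE_INSENSITIVE);
        Matcher m = p.matcher(d3);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            // Copy the text before this match into sb and put "" in place of the tag
            m.appendReplacement(sb, "");
        }
        // Copy whatever follows the last match
        m.appendTail(sb);
        System.out.println(sb.toString());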
    }
}

Output:

博客园 &nbsp;首页 &nbsp;新随笔 &nbsp;联系 &nbsp;订阅&nbsp;管理我是尾巴
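When the replacement is a fixed string, the `find()` / `appendReplacement` / `appendTail` loop in Method 2 does the same job as `Matcher.replaceAll`. A minimal sketch under that assumption, with the pattern compiled once so it can be reused across calls (the class name `TagStripper` and the method `strip` are hypothetical, not from the referenced post):

```java
import java.util.regex.Pattern;

class TagStripper {
    // Compiled once and reused. Same expression as Method 2, without the
    // capture group and the CASE_INSENSITIVE flag, which have no effect on this pattern.
    private static final Pattern TAG = Pattern.compile("<[^>]*>");

    static String strip(String html) {
        return TAG.matcher(html).replaceAll("");
    }

    public static void main(String[] args) {
        // prints: 管理我是尾巴
        System.out.println(strip("<div><a href='#'>管理</a></div>我是尾巴"));
    }
}
```

The explicit loop is still the better fit when each match needs a different replacement, since `appendReplacement` lets you compute the replacement per match.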

------

Original article: https://www.cnblogs.com/tenWood/p/7588345.html