XML Processing

I. JAXP

　　Java API for XML Processing (JAXP), which is part of J2SE.

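　　1. DOM parsing. The parser reads the entire XML document and builds a Document object in memory that represents the whole DOM tree; all further operations on the XML go through that object.

　　　　With this approach, a very large XML document consumes a lot of memory and can easily cause an out-of-memory error.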

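import java.io.FileOutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.*;

public class JAXPDemo {
	public static void main(String[] args) throws Exception {
		DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance(); // obtain the parser factory
		DocumentBuilder builder = factory.newDocumentBuilder(); // obtain the parser
		Document doc = builder.parse("files/book.xml"); // parse the XML file into an in-memory DOM tree
		NodeList nl = doc.getElementsByTagName("售价");
		Node n = nl.item(0); // the first <售价> element
		Element price = doc.createElement("批发价");
		price.setTextContent("28.00元");
		n.getParentNode().insertBefore(price, n); // insert <批发价> just before <售价>
		TransformerFactory tff = TransformerFactory.newInstance();
		tff.setAttribute("indent-number", 2); // implementation-specific attribute; other transformers may reject it
		Transformer tf = tff.newTransformer();
		tf.setOutputProperty(OutputKeys.INDENT, "yes");
		tf.setOutputProperty(OutputKeys.ENCODING, "GBK"); // keep the XML declaration consistent with the writer's encoding
		try (Writer out = new OutputStreamWriter(new FileOutputStream("files/book.xml"), "GBK")) {
			tf.transform(new DOMSource(doc), new StreamResult(out)); // write the modified tree back to the file
		}
	}
}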

　　2. SAX parsing

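　　　SAX uses an event-driven model with two parts: the parser and the event handler. As the parser reads the XML document, each time it reaches a part of the document (the start of the document, the start of an element, text, the end of an element, the end of the document) it calls the corresponding method on the event handler and passes the data it has just read as arguments. Compared with DOM parsing, SAX saves memory, but it is only suitable for reading and querying; it cannot modify the XML document.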
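　　　The parser/handler split described above can be sketched with the JAXP SAX classes as follows. This is a minimal illustration, not a definitive implementation: the class name SaxEventDemo, the helper readAuthors, and the inline sample XML (including the author names) are made up for this example, and the element names are assumed to match those used in book.xml.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class SaxEventDemo {
    /** Event handler: collects the text of every <作者> element. */
    static class AuthorHandler extends DefaultHandler {
        final List<String> authors = new ArrayList<>();
        private StringBuilder text; // non-null only while inside an <作者> element

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            if ("作者".equals(qName)) {
                text = new StringBuilder(); // start buffering text
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            if (text != null) {
                text.append(ch, start, length); // may be called several times per element
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if ("作者".equals(qName)) {
                authors.add(text.toString());
                text = null; // stop buffering
            }
        }
    }

    /** Hypothetical helper: parses the given XML text and returns all author names in document order. */
    static List<String> readAuthors(String xml) {
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            AuthorHandler handler = new AuthorHandler();
            parser.parse(new InputSource(new StringReader(xml)), handler);
            return handler.authors;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Made-up sample data standing in for files/book.xml.
        String xml = "<书架><书><作者>张三</作者><售价>39.00元</售价></书>"
                   + "<书><作者>李四</作者><售价>28.00元</售价></书></书架>";
        System.out.println(readAuthors(xml)); // prints [张三, 李四]
    }
}
```

　　　Note that the handler never holds the whole document: it keeps only the text it is currently buffering plus the results, which is why SAX stays cheap on large files.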

II. DOM4J

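　　 The SAX API provided by JAXP is cumbersome and verbose. The open-source DOM4J package is an alternative: it is "as easy to use as DOM, while parsing with SAX under the hood". One of DOM4J's real strengths is its XPath support. DOM4J depends on the jaxen package, which is needed for the XPath features.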

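import java.util.List;
import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.Element;
import org.dom4j.io.SAXReader;

public class DOM4JDemo {
	public static void main(String[] args) throws DocumentException {
		SAXReader reader = new SAXReader(); // SAX-based reader that builds a dom4j tree
		Document doc = reader.read("files/book.xml");
		Element root = doc.getRootElement();
		List<Element> list = root.elements("书"); // all <书> children of the root
		Element book0 = list.get(0); // the first book
		Element author = book0.element("作者"); // its <作者> child
		System.out.println(author.getText());
	}
}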

　　Using XPath

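// Can be added to the DOM4JDemo class above; also requires: import org.dom4j.Node;
public static void dom4jDemo2() throws DocumentException {
	SAXReader reader = new SAXReader();
	Document doc = reader.read("files/book.xml");
	// XPath: the <作者> child of the first <书> under the root <书架> (needs jaxen on the classpath)
	Node node = doc.selectSingleNode("/书架/书[1]/作者");
	Element e = (Element) node;
	System.out.println(e.getText());
}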
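
　　To make it clearer what the expression /书架/书[1]/作者 selects, here is a small self-contained sketch. It deliberately does not use DOM4J: it evaluates the same expression with JAXP's built-in javax.xml.xpath package so that it runs without third-party jars. The class name XPathSketch, the helper firstAuthor, and the inline sample XML are assumptions made for illustration, not the contents of the real book.xml.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

class XPathSketch {
    /** Hypothetical helper: returns the text of the <作者> child of the first <书> under <书架>. */
    static String firstAuthor(String xml) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            // [1] is a 1-based position predicate: it picks the first <书> element.
            return xpath.evaluate("/书架/书[1]/作者", new InputSource(new StringReader(xml)));
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("could not evaluate XPath", e);
        }
    }

    public static void main(String[] args) {
        // Made-up sample data standing in for files/book.xml.
        String xml = "<书架><书><作者>张三</作者></书><书><作者>李四</作者></书></书架>";
        System.out.println(firstAuthor(xml));
    }
}
```

　　With the sample data above, only the first book's author is selected; changing the predicate to [2] would select the second book's author instead.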