XML

XML的学习

语法：首先在xml中的标签我们是可以直接定义的，也就是说，这是一个针对我们程序的一个配置文件

　　 1.在xml的标签中一样和html中会有开始和结束标签 eg <student> student </student>,但是不同于html来说，他是严格区分大小写，并且要正确来配对的。

　　 2.在xml的标签名里面是不能使用空格的，不能由数字开头，但是可以由下划线开头

　　 3.在xml的标签内也可以设置属性，这种就类似html

　　 4. 在一个xml的中根标签只能有一个。

　　 5，一个标签内可以有多个属性，但不能出现重复的属性名 eg <student name= "wang" id = "01"> student </student>

　　 6.属性值必须以引号包含，不能省略，也不可单双引号混用。

编码问题：

　　首先，你得知道将xml文件能到浏览器上解析这个过程其实就是一个字符->字节到硬盘，然后字节->字节到浏览器的过程

　　1. 保存xml的时候就是编码的过程

　　　 2. 打开/解析xml文件的时候就是解码的过程

　　然后解决这个问题的两个主要的步骤

　　　　1) 保持以上两个地方的编码是完全一致的

　　　　2)编码是要支持中文的即可

注意的地方就是 encoding = "utf-8" 是解析时要用的解码方式，你保存的时候的编码的方式是自己设置的，特别是在记事本中保存的时候，如果在eclipse就不会存在这个问题，因为他会根据你的解码方式编码。

DOM 解析： xml解析器一次性把整个xml文档加载进内存，然后在内存中构建一颗Document的对象树，然后通过Document对象，得到树上的节点对象，，通过节点对象访问到xml文档的内容。

用到的工具 -- Dom4j （需要导包）

下面就是重头戏了

　　通Dom4来解析xml文件，相应的代码下图展示

　　一：得到节点

　　>main 方法里面主要是准备工作 -- 创建一个xml的解析器，然后用这个解析器解析出一个document

　　>getnNode方法里面只要是写了如何得到document里面的节点

　　>getAllNodes方法里面主要是写怎样用一个简单的方法得到所有的节点

package javaDom4j;


import java.io.File;
import java.util.Iterator;

import javax.management.RuntimeErrorException;

import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.Element;
import org.dom4j.Node;
import org.dom4j.io.SAXReader;
import org.junit.Test;

public class Dom4jRead {
    private static final Element Element = null;

    public static void main(String[] args) {
        // step 1. create a parser of xml
        SAXReader saxReader = new SAXReader();
        // step 2. read xml document
        try {
            Document doc = saxReader.read(new File("Students.xml"));
            System.out.println(doc);
        } catch (DocumentException e) {
            e.printStackTrace();
            throw new RuntimeException(e);
        }

        
    }
    @Test
    public void  getNode() {
        // step 1. create a parser of xml
                SAXReader saxReader = new SAXReader();
                try {
                    Document doc = saxReader.read(new File("Students.xml"));
                    // step 2 : get a iterator of the present node's son node(pay attention -- 
                    //the iterator only contain the son nodes of the present node,
                    //not contain the grandson and someone like that) 
                    Iterator<Node> it = doc.nodeIterator();
                    // step 3 ： then u can go through the nodes but only the present node's son nodes
                    while(it.hasNext()){
                        Node node = it.next();
                        System.out.println(node.getName());
                        
                        //step 4 : get the node's son nodes again 
                        // pay attention : only the element has the iterator,the text or attribute node hasn't iterator
                        if(node instanceof Element) {
                            Element element = (Element) node;
                            Iterator<Node> its = element.nodeIterator();
                            while(its.hasNext()){
                                Node nodes = its.next();
                                System.out.println(nodes.getName());
                            }
                        }
                    }
                } catch (DocumentException e) {
                    e.printStackTrace();
                    throw new RuntimeException(e);
                }
      
    }
    // 其实，可以看出上面的方法比较麻烦，每次去下面的一个子节点就要再写一次while所以我们再次想另一个方法来去节点
    
    @Test
    public void getAllNodes(){
        // step 1. create a parser of xml
        SAXReader saxReader = new SAXReader();
        try {
            Document doc = saxReader.read(new File("Students.xml"));
            
            //step 2 : get the rootElement-- because the root element is unique in a xml,so it can be getted with a method
            Element ele = doc.getRootElement();
            //step 3 : use a method to get the present's son nodes
            getChildNodes(ele);
            
        } catch (DocumentException e) {
            e.printStackTrace();
            throw new RuntimeException(e);
        }

    }
    
    private void getChildNodes(Element element) {
        System.out.println(element.getName());
        
        Iterator<Node> nods = element.nodeIterator();
        while(nods.hasNext()) {
            Node node = nods.next();
            
            // step 4 : judge the node is the element?
            // 用了一个简单的递归
            if(node instanceof Element) {
                getChildNodes((Element)node);
            }
        }
    }

}

简单语法 --　　　　

　1. 得到当前节点的子节点方法
　　　　Iterator Element.nodeIterator()

　2. 得到根标签

　　　　Element getRootElement()

　二. 得到标签和属性及文本

package javaDom4j;

import java.util.Iterator;
import java.util.List;

import org.dom4j.Attribute;
import org.dom4j.Document;
import org.dom4j.Element;
import org.dom4j.io.SAXReader;
import org.junit.Test;

public class Dom4jGetELE {
    @Test
    public void getElement() throws Exception {
        SAXReader reader = new SAXReader();
        Document document = reader.read("Students.xml");
        
        // step 1 : get the rootElement 
        Element rootElm = document.getRootElement();
        // step 2 : according the name to get the element
        Element nameEle = rootElm.element("name");
        System.out.println(nameEle.getName());
        
        // step 3 : get all son elements of rootEle
        List<Element> elemLists = rootElm.elements();
        
        // step 4 : go through the list
        for(Element element : elemLists) {
            System.out.println(element.getName());
        }
        
        // 但是要获取更深层的标签只能一层层去取
    }
    @Test
    public void getAttribute() throws Exception {
        SAXReader reader = new SAXReader();
        Document document = reader.read("Students.xml");
        
        // step 1 : get the element first
        Element idElement = document.getRootElement().element("id");
        // step 2 : according the name of attribute to get the value 
        String  idvalue = idElement.attributeValue("idN");
        System.out.println(idvalue);
        
        // step 3 : according to name of attribute to get the object
        Attribute idAttribute = idElement.attribute("idN");
        System.out.println(idAttribute.getName()+": "+ idAttribute.getValue());
        
        // step 4 : to get the all attributes without knowming the name
        // 4.1 -- list
        List<Attribute> attList = idElement.attributes();
        for(Attribute attribute : attList) {
            System.out.println(attribute.getName()+"="+attribute.getValue());
        }
        
        // 4.2 -- iterator 
        Iterator<Attribute> attIterator  = idElement.attributeIterator();
        while(attIterator.hasNext()) {
            Attribute a = attIterator.next();
            
            System.out.println(a.getName()+"="+a.getValue());
        }
        
    }
    
    @Test
    public void getText() throws Exception {
        SAXReader reader = new SAXReader();
        Document document = reader.read("Students.xml");
        
        // step 1 : get the element first
        Element idElement = document.getRootElement().element("id");
        // step 2 : get the text
        // 2.1
        String idTextString = idElement.getTextTrim();
        System.out.println(idTextString);
        // 2.2 获取当前标签的指定名称的子标签的文本内容
        String careerText = document.getRootElement().element("skill").elementTextTrim("career");
        System.out.println(careerText);
    }
    
}

完整将xml里面的内容变成字符串输出例子

package javaDom4j;

import java.util.Iterator;
import java.util.List;

import org.dom4j.Attribute;
import org.dom4j.Document;
import org.dom4j.Element;
import org.dom4j.Node;
import org.dom4j.Text;
import org.dom4j.io.SAXReader;
import org.junit.Test;


// 完整的读取xml文件--联系
public class Dom4jReadeStudents {
    public static void main(String[] args) throws Exception {
        SAXReader reader = new SAXReader();
        Document doc = reader.read("Students.xml");
        
        Element rootEle = doc.getRootElement();
        
//        getAllElements(rootEle);
//        StringBuffer sb = new StringBuffer();
//        getChildEle(rootEle, sb);
//        System.out.println(sb);
        
        StringBuffer sb2 =new StringBuffer();
        getChildNODE(rootEle, sb2);
        System.out.println(sb2);
    }
    
    @Test
    // get the root element's son elements
    static private void getAllElements(Element e) {
        System.out.println(e.getName());
        
        // get attributes of elements
        List<Attribute> attriList = e.attributes();
        for (Attribute attribute : attriList) {
            System.out.println(attribute.getName()+"=" +attribute.getValue());
        }
        
        // get the text
        String content = e.getTextTrim();
        System.out.println(content);
    
        List<Element> sonElements = e.elements();
        // 遍历
        for (Element element : sonElements) {
            getAllElements(element);
        }
        
    }
    
    // get the content in xml in order 
    // 这种方式取文本有点难过
    static private void getChildEle(Element element, StringBuffer sb) {
        // 开始标签的设置
        sb.append("<"+element.getName());
        List<Attribute> attList = element.attributes();
        if(attList != null){
            for (Attribute attribute : attList) {
                sb.append(" "+attribute.getName()+ "="" + attribute.getValue()+""");
            }
        }
        sb.append(">");
        sb.append("
");
        sb.append("	"+element.getTextTrim());
        
        List<Element> sonElements = element.elements();
        // 遍历
        if(sonElements!= null) {
            for (Element e : sonElements) {
                getChildEle(e, sb);
            }
        }
        
        // 结束标签的显示
        sb.append("
");
        sb.append("</"+element.getName()+">"+"
");
        
    }
        static private void getChildNODE(Element element, StringBuffer sb) {
            // 开始标签的设置
            sb.append("<"+element.getName());
            List<Attribute> attList = element.attributes();
            if(attList != null){
                for (Attribute attribute : attList) {
                    sb.append(" "+attribute.getName()+ "="" + attribute.getValue()+""");
                }
            }
            sb.append(">");

            // 下面改成取节点--这样可以保留空格换行，因为这也是一个文本节点，可以直接保留下来
            Iterator<Node> iterator = element.nodeIterator();
            while(iterator.hasNext()) {
                // 如果是标签
                Node node = iterator.next();
                if(node instanceof Element){
                    Element el = (Element)node;
                    getChildNODE(el, sb);
                }
                //如果是文本
                if(node instanceof Text) {
                    Text text = (Text) node;
                    sb.append(text.getText());
                }
            }
        
            
            // 结束标签的显示

            sb.append("</"+element.getName()+">");
            
        }
}

结果