Day ⑥: the xml module

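What is XML?
XML (Extensible Markup Language) is used to mark up data and define data types. It is a meta-language that lets users define their own markup languages.
First, it is made up of paired element tags: <aa></aa>

An element can have attributes: <aa id='123'></aa>, where id='123' is the attribute

An element can contain data: <aa>abc</aa>, where abc is the value

An element can contain child elements (forming a hierarchy):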

<aa>

<bb></bb>

</aa>
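
As a rough illustration of these four ideas, the sketch below parses a small XML string with Python's standard xml.etree.ElementTree module. The sample string is an assumption pieced together from the fragments above, not something taken from a real file.

```python
import xml.etree.ElementTree as ET

# Assumed sample combining the fragments above: an attribute, some text, and a child.
sample = "<aa id='123'>abc<bb></bb></aa>"

elem = ET.fromstring(sample)
print(elem.tag)                        # element name
print(elem.attrib)                     # attributes as a dict
print(elem.text)                       # text that appears before the first child
print([child.tag for child in elem])   # names of the direct child elements
```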

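Parsing XML in Python
The common XML programming interfaces are DOM and SAX. They process XML files in different ways, so they suit different situations.
Python offers three ways to parse XML: SAX, DOM, and ElementTree:
1. SAX (Simple API for XML)
The Python standard library includes a SAX parser. SAX uses an event-driven model: while parsing the XML it fires a series of events and calls user-defined callback functions to handle the content.
2. DOM (Document Object Model)
DOM parses the XML data into a tree in memory, and you work with the XML by operating on that tree.
The Document Object Model (DOM) is the standard programming interface recommended by the W3C for processing XML.
When a DOM parser parses an XML document, it reads the whole document in one pass and stores every element in a tree structure in memory. You can then use the functions DOM provides to read or modify the content and structure of the document, and write the modified content back to an XML file.
3. ElementTree
ElementTree is like a lightweight DOM with a convenient, friendly API. It is easy to use, fast, and light on memory.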
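
To make the difference between the three approaches more concrete, here is a minimal sketch that extracts the same country names with each of them. The sample data, the DATA variable, and the CountryHandler class are assumptions made up for this illustration; only the standard-library modules xml.sax, xml.dom.minidom, and xml.etree.ElementTree are used.

```python
import xml.sax
import xml.dom.minidom
import xml.etree.ElementTree as ET

# Assumed sample data: a cut-down list of countries.
DATA = """<data>
    <country name="Liechtenstein"><year>2008</year></country>
    <country name="Singapore"><year>2011</year></country>
</data>"""


class CountryHandler(xml.sax.ContentHandler):
    """Hypothetical SAX handler: records the name of every <country> start tag."""

    def __init__(self):
        super().__init__()
        self.names = []

    def startElement(self, name, attrs):
        if name == "country":
            self.names.append(attrs.getValue("name"))


# 1. SAX: the parser calls back into the handler while it reads the input.
handler = CountryHandler()
xml.sax.parseString(DATA.encode("utf-8"), handler)
sax_names = handler.names

# 2. DOM: the whole document is loaded into a tree first, then queried.
dom = xml.dom.minidom.parseString(DATA)
dom_names = [node.getAttribute("name") for node in dom.getElementsByTagName("country")]

# 3. ElementTree: also an in-memory tree, but with a smaller API.
root = ET.fromstring(DATA)
et_names = [country.get("name") for country in root.findall("country")]

print(sax_names, dom_names, et_names)
```

All three produce the same list. SAX never holds the whole document in memory, which is why it tends to be chosen for very large inputs, while the two tree-based approaches are usually more convenient for small files.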

1. Loading XML
There are two ways to load XML: from a string, or from a file.
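
The two loading styles might look like the sketch below. The XML text and the temporary file name sample.xml are assumptions used only so that the example runs on its own.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

# Assumed sample content.
xml_text = "<yaobin><country name='Panama'><rank>69</rank></country></yaobin>"

# Way 1: load from a string. fromstring() returns the root Element directly.
root_from_string = ET.fromstring(xml_text)

# Way 2: load from a file. parse() returns an ElementTree; getroot() gives the root.
# A temporary file (hypothetical name) stands in for a real file on disk.
path = os.path.join(tempfile.mkdtemp(), "sample.xml")
with open(path, "w", encoding="utf-8") as f:
    f.write(xml_text)
tree = ET.parse(path)
root_from_file = tree.getroot()

print(root_from_string.tag, root_from_file.tag)
```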

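2. Ways to get elements
a) getiterator (removed in Python 3.9; use iter() instead)
b) getchildren (removed in Python 3.9; use list(element) or iterate over the element directly)
c) the find method
d) the findall method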
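
A short sketch of these lookups on an assumed in-memory sample follows. Because getiterator() and getchildren() no longer exist as of Python 3.9, the sketch uses iter() and list(element) in their place.

```python
import xml.etree.ElementTree as ET

# Assumed sample data for illustration.
root = ET.fromstring("""<yaobin>
    <country name="Singapore"><rank>5</rank><year>2011</year></country>
    <country name="Panama"><rank>69</rank><year>2012</year></country>
</yaobin>""")

# iter() walks the whole subtree (the replacement for getiterator()).
years = [node.text for node in root.iter("year")]

# list(element) gives the direct children (the replacement for getchildren()).
children = [child.tag for child in list(root)]

# find() returns the first matching direct child, or None if nothing matches.
first = root.find("country")

# findall() returns a list of all matching direct children.
names = [c.get("name") for c in root.findall("country")]

print(years, children, first.get("name"), names)
```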

Examples of working with XML using ElementTree:

test.xml

<?xml version="1.0"?>
<yaobin>
    <country name="Liechtenstein">
        <rank updated="yes">2</rank>
        <year>2008</year>
        <gdppc>141100</gdppc>
        <neighbor name="Austria" direction="E"/>
        <neighbor name="Switzerland" direction="W"/>
    </country>
    <country name="Singapore">
        <rank updated="yes">5</rank>
        <year>2011</year>
        <gdppc>59900</gdppc>
        <neighbor name="Malaysia" direction="N"/>
    </country>
    <country name="Panama">
        <rank updated="yes">69</rank>
        <year>2012</year>
        <gdppc>13600</gdppc>
        <neighbor name="Costa Rica" direction="W"/>
        <neighbor name="Colombia" direction="E"/>
    </country>
</yaobin>

I. Traversal:

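#!/usr/bin/env python
# coding=utf-8
# cElementTree was removed in Python 3.9; ElementTree uses the C accelerator automatically
import xml.etree.ElementTree as ET

tree = ET.parse("test.xml")
root = tree.getroot()
print(root.tag)

# Walk the direct children of the root, then each child's own children
for child in root:
    print(child.tag, child.attrib)
    for i in child:
        print("---->", i.tag, i.text)

# Visit only the <year> nodes, at any depth
for node in root.iter('year'):
    print(node.tag, node.text)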

II. Modifying and deleting

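#!/usr/bin/env python
# coding=utf-8
# cElementTree was removed in Python 3.9; ElementTree uses the C accelerator automatically
import xml.etree.ElementTree as ET

tree = ET.parse("test.xml")
root = tree.getroot()
print(root.tag)

# Modify: add 100 to every <year> and attach a new attribute
for node in root.iter('year'):
    new_year = int(node.text) + 100
    node.text = str(new_year)
    node.set("new_attrib", "attrib_value")
tree.write("new_test.xml")

# Delete: drop every <country> whose rank is above 50
# (findall() returns a list, so removing inside the loop is safe)
for country in root.findall('country'):
    rank = int(country.find('rank').text)
    if rank > 50:
        root.remove(country)
tree.write("new_test2.xml")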
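
To check the effect of these edits without writing any files, the same steps can be applied to an in-memory string, roughly as sketched below. The two-country sample is an assumption, a shortened version of test.xml.

```python
import xml.etree.ElementTree as ET

# Assumed sample: a shortened version of test.xml.
root = ET.fromstring("""<yaobin>
    <country name="Singapore"><rank>5</rank><year>2011</year></country>
    <country name="Panama"><rank>69</rank><year>2012</year></country>
</yaobin>""")

# Same modification as above: shift each year and add an attribute.
for node in root.iter("year"):
    node.text = str(int(node.text) + 100)
    node.set("new_attrib", "attrib_value")

# Same deletion as above: remove countries ranked above 50.
for country in root.findall("country"):
    if int(country.find("rank").text) > 50:
        root.remove(country)

remaining = [c.get("name") for c in root.findall("country")]
years = [y.text for y in root.iter("year")]
print(remaining, years)
print(ET.tostring(root, encoding="unicode"))
```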

III. Creating an XML document from scratch

#!/usr/bin/env python
# coding=utf-8
import xml.etree.ElementTree as ET

new_xml = ET.Element("namelist")  # root element

# First <name> entry with two children
name = ET.SubElement(new_xml, "name", attrib={"enrolled": "yes"})
age = ET.SubElement(name, "age", attrib={"checked": "no"})
sex = ET.SubElement(name, "sex")
age.text = '33'
sex.text = "boy"

# Second <name> entry
name2 = ET.SubElement(new_xml, "name", attrib={"enrolled": "no"})
age = ET.SubElement(name2, "age")
sex = ET.SubElement(name2, "sex")
age.text = '19'
sex.text = 'girl'

et = ET.ElementTree(new_xml)  # build the document object
et.write("my_test.xml", encoding="utf-8", xml_declaration=True)
ET.dump(new_xml)  # print the generated XML
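
The output written above ends up on a single line. As a possible follow-up, the sketch below builds a smaller tree, indents it when ET.indent() is available (it was added in Python 3.9), and parses the result back to confirm that it round-trips. The variable names are assumptions for this example.

```python
import xml.etree.ElementTree as ET

# Build a small tree in the same style as the example above.
root = ET.Element("namelist")
name = ET.SubElement(root, "name", attrib={"enrolled": "yes"})
ET.SubElement(name, "age", attrib={"checked": "no"}).text = "33"
ET.SubElement(name, "sex").text = "boy"

# ET.indent() was added in Python 3.9; skip it on older interpreters.
if hasattr(ET, "indent"):
    ET.indent(root)

# Serialize to a str instead of writing a file.
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)

# Parse the text back to confirm that nothing was lost.
parsed = ET.fromstring(xml_text)
age = parsed.find("name/age")
print(age.text, age.get("checked"))
```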
