Java中实现SAX解析xml文件到MySQL数据库

大致步骤:

1、Java bean

2、DBHelper.java

3、重写DefaultHandler中的方法:MyHander.java

4、循环写数据库:SAXParserDemo.java

①xml文件:(要把第二行dtd的绑定删掉)

 1 <?xml version="1.0"  encoding="utf-8" ?>
 2 <!DOCTYPE dblp SYSTEM "dblp.dtd">
 3 <dblp>
 4 
 5 <article mdate="2017-05-11" key="journals/ijtm/LuMW10">
 6     <author>Iuan-Yuan Lu</author>
 7     <author>Chih-Jen Mao</author>
 8     <author>Chun-Hsien Wang</author>
 9     <title>Intrafirm technology and knowledge transfer: a best practice perspective.</title>
10     <pages>338-356</pages>
11     <year>2010</year>
12     <volume>49</volume>
13     <journal>IJTM</journal>
14     <number>4</number>
15     <ee>https://doi.org/10.1504/IJTM.2010.030162</ee>
16     <url>db/journals/ijtm/ijtm49.html#LuMW10</url>
17 </article>
18 
19 <article mdate="2017-05-11" key="journals/ijtm/TingTH14">
20 <author>Kuo-Chang Ting</author>
21 <author>Ping Ho Ting</author>
22 <author>Po-Wen Hsiao</author>
23 <title>Why are bloggers willing to share their thoughts via travel blogs?</title>
24 <pages>89-108</pages>
25 <year>2014</year>
26 <volume>64</volume>
27 <journal>IJTM</journal>
28 <number>1</number>
29 <ee>https://doi.org/10.1504/IJTM.2014.059237</ee>
30 <url>db/journals/ijtm/ijtm64.html#TingTH14</url>
31 </article>
32 
33 <article mdate="2017-05-11" key="journals/ijtm/Howells00">
34     <author>Jeremy Howells</author>
35     <title>International coordination of technology flows and knowledge activity in innovation.</title>
36     <pages>806-819</pages>
37     <year>2000</year>
38     <volume>19</volume>
39     <journal>IJTM</journal>
40     <number>7/8</number>
41 </article>
42 </dblp>
dblp.xml

就留下了三组数据。之前的数据量太大了。(我把所有的数据贴到最后,你们也可以去网上下载)

②DBHelper.java

  1 package sax;
  2 
  3 import java.sql.Connection;
  4 import java.sql.DriverManager;
  5 import java.sql.PreparedStatement;
  6 import java.sql.ResultSet;
  7 import java.sql.Statement;
  8 
  9 /**
 10  * 数据库工具类,负责完成打开、关闭数据库,执行查询或更新
 11  * @author MKing
 12  *
 13  */
 14 public class DbHelper {
 15     /**
 16      * 数据库URL
 17      */
 18     private static final String URL = "jdbc:mysql://localhost:3306/train";
 19     /**
 20      * 登录用户名
 21      */
 22     private static final String USER = "root";
 23     /**
 24      * 登录密码
 25      */
 26     private static final String PASSWORD = "yourpassword";
 27     
 28     private static Connection connection = null;
 29     private static Statement statement = null;
 30 
 31     private static DbHelper helper = null;
 32 
 33     static {
 34         try {
 35             Class.forName("com.mysql.jdbc.Driver");
 36         } catch (ClassNotFoundException e) {
 37             e.printStackTrace();
 38         }
 39     }
 40 
 41     public DbHelper() throws Exception {
 42         connection = DriverManager.getConnection(URL, USER, PASSWORD);
 43         statement = connection.createStatement();
 44     }
 45 
 46     /**
 47      * 返回单例模式的数据库辅助对象
 48      * 
 49      * @return
 50      * @throws Exception 
 51      */
 52     public static DbHelper getDbHelper() throws Exception {
 53         if (helper == null || connection == null || connection.isClosed())
 54             helper = new DbHelper();
 55         return helper;
 56     }
 57 
 58     /**
 59      * 执行查询
 60      * @param sql 要执行的SQL语句
 61      * @return  查询的结果集对象
 62      * @throws Exception
 63      */
 64     public ResultSet executeQuery(String sql) throws Exception {
 65         if (statement != null) {
 66             return statement.executeQuery(sql);
 67         }
 68 
 69         throw new Exception("数据库未正常连接");
 70     }
 71 
 72     /**
 73      * 执行查询
 74      * @param sql  要执行的带参数的SQL语句
 75      * @param args  SQL语句中的参数值
 76      * @return  查询的结果集对象
 77      * @throws Exception
 78      */
 79     public ResultSet executeQuery(String sql, Object...args) throws Exception {
 80         if (connection == null || connection.isClosed()) {
 81             DbHelper.close();
 82             throw new Exception("数据库未正常连接");
 83         }
 84         PreparedStatement ps = connection.prepareStatement(sql);
 85         int index = 1;
 86         for (Object arg : args) {
 87             ps.setObject(index, arg);
 88             index++;
 89         }
 90         
 91         return ps.executeQuery();
 92     }
 93     
 94     /**
 95      * 执行更新
 96      * @param sql  要执行的SQL语句
 97      * @return  受影响的记录条数
 98      * @throws Exception
 99      */
100     public int executeUpdate(String sql) throws Exception {
101         if (statement != null) {
102             return statement.executeUpdate(sql);
103         }
104         throw new Exception("数据库未正常连接");
105     }
106     
107     /**
108      * 执行更新
109      * @param sql  要执行的SQL语句
110      * @param args  SQL语句中的参数
111      * @return  受影响的记录条数
112      * @throws Exception
113      */
114     public int executeUpdate(String sql, Object...args) throws Exception {
115         if (connection == null || connection.isClosed()) {
116             DbHelper.close();
117             throw new Exception("数据库未正常连接");
118         }
119         PreparedStatement ps = connection.prepareStatement(sql);
120         int index = 1;
121         for (Object arg : args) {
122             ps.setObject(index, arg);
123             index++;
124         }
125         return ps.executeUpdate();
126     }
127     
128     /**
129      * 获取预编译的语句对象
130      * @param sql  预编译的语句
131      * @return  预编译的语句对象
132      * @throws Exception
133      */
134     public PreparedStatement prepareStatement(String sql) throws Exception {
135         return connection.prepareStatement(sql);
136     }
137     
138     /**
139      * 关闭对象,同时将关闭连接
140      */
141     public static void close() {
142         try {
143             if (statement != null)
144                 statement.close();
145             if (connection != null) 
146                 connection.close();
147         } catch (Exception e) {
148             e.printStackTrace();
149         } finally {
150             helper = null;
151         }
152     }
153 }
DBHelper.java

别忘了放jar包

MyHander.java

  1 package sax;
  2 
  3 import java.util.ArrayList;
  4 import java.util.List;
  5 
  6 import org.xml.sax.Attributes;
  7 import org.xml.sax.SAXException;
  8 import org.xml.sax.helpers.DefaultHandler;
  9 
 10 public class MyHandler extends DefaultHandler {
 11     private List<Article> articles = new ArrayList<>();  // 学生列表
 12     private Article article = null;
 13     private int propertyOrder = 1;  
 14     // 1-id, 2-author1, 3-author2, 4-author3, 5-title, 
 15     // 6-pages, 7-year, 8-volume, 9-journal, 10-number, 11-ee$t, 12-url$t
 16     
 17     @Override
 18     public void startDocument() throws SAXException {
 19         System.out.println("文档开始");
 20     }
 21     
 22     @Override
 23     public void endDocument() throws SAXException {
 24         System.out.println("文档结束");
 25     }
 26 
 27     @Override
 28     public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException {
 29         if (qName.equals("article")) {  // 新的article开始
 30             article = new Article();
 31             return;
 32         }
 33         switch (qName) {
 34         case "author":
 35             // 2  3  4
 36             if (propertyOrder < 4) {
 37                 propertyOrder++;
 38             }
 39             break;
 40         case "title":    
 41             propertyOrder = 5;
 42             break;
 43         case "pages":    
 44             propertyOrder = 6;
 45             break;
 46         case "year":    
 47             propertyOrder = 7;
 48             break;
 49         case "volume":    
 50             propertyOrder = 8;
 51             break;
 52         case "journal":    
 53             propertyOrder = 9;
 54             break;
 55         case "number":    
 56             propertyOrder = 10;
 57             break;
 58         case "ee":    
 59             propertyOrder = 11;
 60             break;
 61         case "url":    
 62             propertyOrder = 12;
 63             break;
 64         }
 65     }
 66     
 67     @Override
 68     public void endElement(String uri, String localName, String qName) throws SAXException {
 69         if (qName.equals("article")) {  // 新的artile结束
 70             articles.add(article);
 71             article = null;
 72             propertyOrder = 1;
 73         }
 74     }
 75     
 76     // 1-id, 2-author1, 3-author2, 4-author3, 5-title, 
 77     // 6-pages, 7-year, 8-volume, 9-journal, 10-number, 11-ee$t, 12-url$t
 78     
 79     @Override
 80     public void characters(char[] ch, int start, int length) throws SAXException {
 81         String content = new String(ch, start, length);
 82         switch (propertyOrder) {
 83         case 2:
 84             // System.out.println(content);
 85             article.setAuthor1(content);
 86             break;
 87         case 3:
 88             article.setAuthor2(content);
 89             break;
 90         case 4:
 91             article.setAuthor3(content);
 92             break;
 93         case 5:
 94             article.setTitle(content);
 95             break;
 96         case 6:
 97             article.setPages(content);
 98             break;
 99         case 7:
100             article.setYear(content);
101             break;
102         case 8:
103             article.setVolume(content);
104             break;
105         case 9:
106             article.setJournal(content);
107             break;
108         case 10:
109             article.setNumber(content);
110             break;
111         case 11:
112             article.setEe$t(content);
113             break;
114         case 12:
115             article.setUrl$t(content);
116         }
117     }
118     
119     List<Article> getArticles() {
120         return articles;
121     }
122 }
MyHander.java

因为作者最多三个,所以这里就写的1、2、3,在给他们赋值的时候用了一个小技巧,使用变量判断当前数据读到了哪里。

④SAXParserDemo.java

 1 package sax;
 2 
 3 import java.util.List;
 4 
 5 import javax.xml.parsers.SAXParser;
 6 import javax.xml.parsers.SAXParserFactory;
 7 
 8 public class SAXParserDemo {
 9 
10     public static void main(String[] args) throws Exception {
11         SAXParserFactory factory = SAXParserFactory.newInstance();
12         SAXParser parser = factory.newSAXParser();
13         MyHandler handler = new MyHandler(); // 使用自定义Handler
14         parser.parse("dblp/dblp.xml", handler);
15         System.out.println("解析完毕.");
16 
17         List<Article> articles = handler.getArticles();
18         DbHelper dbHelper = DbHelper.getDbHelper();
19         
20         System.out.println("开始写入数据库库...");
21         int cnt = 0;
22         
23         String sql = "INSERT INTO `train`.`dblp$article` (`author1`, `author2`, `author3`, `title`, `pages`, `year`, `volume`, `journal`, `number`, `ee`, `url`) values (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)";
24         for (Article article : articles) {
25             
26             // 使用?的方式传参写入数据库不需要考虑单双引号的问题。
27             Object []data = new String[]{article.getAuthor1(), 
28                     article.getAuthor2(), article.getAuthor3(), article.getTitle(), article.getPages(), article.getYear(), 
29                     article.getVolume(), article.getJournal(), article.getNumber(), article.getEe$t(), article.getUrl$t()};
30             dbHelper.executeUpdate(sql, data);
31             
32             // INSERT INTO `train`.`dblp$article` (`author1`, `author2`, `author3`, `title`, `pages`, `year`, `volume`, `journal`, `number`, `ee`, `url`)
33             //        VALUES ("Iuan-Yuan Lu", "Chih-Jen Mao", "Chun-Hsien Wang", "Intrafirm technology and knowledge transfer: a best practice perspective.", "338-356", 
34             //                 "2010", "49", "IJTM", "4", "https://doi.org/10.1504/IJTM.2010.030162", "db/journals/ijtm/ijtm49.html#LuMW10");
35             
36 //            String sql = "INSERT INTO `train`.`dblp$article` (`author1`, `author2`, `author3`, `title`, `pages`, `year`, `volume`, `journal`, `number`, `ee`, `url`)"
37 //                    + " VALUES (""
38 //                    + article.getAuthor1()
39 //                    + "", ""
40 //                    + article.getAuthor2()
41 //                    + "", ""
42 //                    + article.getAuthor3()
43 //                    + "", ""
44 //                    + article.getTitle().replaceAll(""", "\\"")
45 //                    + "", ""
46 //                    + article.getPages()
47 //                    + "", ""
48 //                    + article.getYear()
49 //                    + "", ""
50 //                    + article.getVolume()
51 //                    + "", ""
52 //                    + article.getJournal()
53 //                    + "", ""
54 //                    + article.getNumber()
55 //                    + "", ""
56 //                    + article.getEe$t()
57 //                    + "", ""
58 //                    + article.getUrl$t()
59 //                    + "");";
60             // System.out.println(sql);
61             // dbHelper.executeUpdate(sql);
62             cnt++;
63             
64             // The acceptance of "self-service" technology in the Egyptian telecom industry.
65             // 数据库返回来的错误
66         }
67         System.out.println("共" + cnt + "
, 写入数据库完成...");
68         // cnt = 3
69         DbHelper.close();
70     }
71 
72 }
View CoSAXParserDemo.java

备注写的很清楚了,如果使用注释掉的方法的话,因为之前数据(这里只复制过来了3条)中存在单引号  '  和双引号  "  ,还有斜杠  /  。使用参数写入数据库,不需要考虑单双引号和斜杠的情况。

 ⑤步骤中的所有数据:dblp.xml  点击下载地址(数据很多,下载下来去除后面的.txt即可)

 ⑥dblp.dtd  点击下载地址(下载下来去除后面的.txt即可)

原文地址:https://www.cnblogs.com/Ddlm2wxm/p/9193184.html