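Loading data
(1) Open a file
(2) Open a URL
(3) Open a database
(4) Generate artificial data with one of the data generators (DataGenerators)
This post covers option (3), specifically connecting to MySQL.
There are plenty of tutorials online, so this is just a short list of steps.
One article I recommend: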
http://blog.csdn.net/senaku/article/details/2225943
Download the necessary tools (skip anything you already have). MySQL driver:
http://pan.baidu.com/share/link?shareid=2530503288&uk=1010575044
Weka 3.6.10 (the version without a bundled JRE):
http://pan.baidu.com/share/link?shareid=2578038226&uk=1010575044
JDK:
http://pan.baidu.com/share/link?shareid=2585337924&uk=1010575044
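I won't cover the installers themselves. Install the JDK first, set the environment variables, then install Weka.
Set the environment variables:
JAVA_HOME
D:\Program Files\Java\jdk1.7.0_09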
ClassPath
.
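After downloading, create a lib folder in the Weka installation directory "D:\Program Files\Weka-3-6" and copy the driver jar into it. Also put a copy of the same jar (mysql-connector-java-5.1.26-bin.jar, matching the ClassPath entry below; use the file name of the jar you actually downloaded) in "D:\Program Files\Java\jdk1.7.0_09\jre\lib\ext".
Then update the environment variables: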
JAVA_HOME
D:\Program Files\Java\jdk1.7.0_09
WEKA_HOME
D:\Program Files\Weka-3-6
ClassPath
.;%WEKA_HOME%\lib\mysql-connector-java-5.1.26-bin.jar;%JAVA_HOME%\jre\lib\ext\mysql-connector-java-5.1.26-bin.jar;
With this in place, Weka can find the database driver jar in lib.
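A quick way to check the ClassPath setting before starting Weka is to try loading the driver class by name. The sketch below is not part of Weka: DriverCheck and isOnClasspath are names made up for this example, and com.mysql.jdbc.Driver is the driver class named in the props file used in this post.

```java
// Sketch: report whether a JDBC driver class can be loaded from the classpath.
// DriverCheck and isOnClasspath are hypothetical names, not Weka or MySQL API.
class DriverCheck {
    /** Returns true if the named class is visible on the current classpath. */
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // Not on the classpath, or found but failed to load.
            return false;
        }
    }

    public static void main(String[] args) {
        // Default to the driver class configured in DatabaseUtils.props.
        String driver = args.length > 0 ? args[0] : "com.mysql.jdbc.Driver";
        System.out.println(driver + (isOnClasspath(driver) ? " found" : " NOT found"));
    }
}
```

Compile it and run it from a command prompt that has the ClassPath variable set as above. If it reports NOT found, the jar path in ClassPath most likely does not match the actual file.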
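weka.jar in the Weka installation directory (D:\Program Files\Weka-3-6\weka.jar in this setup) is an archive, so open it with a zip tool. In its weka\experiment folder, find DatabaseUtils.props.mysql, rename it to DatabaseUtils.props so that it replaces the existing DatabaseUtils.props, then edit it so that it reads as follows: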
# Database settings for MySQL 3.23.x, 4.x
#
# General information on database access can be found here:
# http://weka.wikispaces.com/Databases
#
# url:     http://www.mysql.com/
# jdbc:    http://www.mysql.com/products/connector/j/
# author:  Fracpete (fracpete at waikato dot ac dot nz)
# version: $Revision: 5836 $

# JDBC driver (comma-separated list)
#jdbcDriver=org.gjt.mm.mysql.Driver
jdbcDriver=com.mysql.jdbc.Driver

# database URL
#jdbcURL=jdbc:mysql://server_name:3306/database_name
jdbcURL=jdbc:mysql://localhost:3306/rtest

# specific data types
# string, getString() = 0;    --> nominal
# boolean, getBoolean() = 1;  --> nominal
# double, getDouble() = 2;    --> numeric
# byte, getByte() = 3;        --> numeric
# short, getByte()= 4;        --> numeric
# int, getInteger() = 5;      --> numeric
# long, getLong() = 6;        --> numeric
# float, getFloat() = 7;      --> numeric
# date, getDate() = 8;        --> date
# text, getString() = 9;      --> string
# time, getTime() = 10;       --> date
TINYINT=3
SMALLINT=4
#SHORT=4
SHORT=5
INTEGER=5
INT=5
INT_UNSIGNED=6
BIGINT=6
LONG=6
REAL=7
NUMERIC=2
DECIMAL=2
FLOAT=2
DOUBLE=2
CHAR=0
TEXT=0
VARCHAR=0
LONGVARCHAR=9
BINARY=0
VARBINARY=0
LONGVARBINARY=9
BIT=1
BLOB=9
DATE=8
TIME=8
DATETIME=8
TIMESTAMP=8

# other options
CREATE_DOUBLE=DOUBLE
CREATE_STRING=TEXT
CREATE_INT=INT
CREATE_DATE=DATETIME
DateFormat=yyyy-MM-dd HH:mm:ss
checkUpperCaseNames=false
checkLowerCaseNames=false
checkForTable=true

# All the reserved keywords for this database
# Based on the keywords listed at the following URL (2009-04-13):
# http://dev.mysql.com/doc/mysqld-version-reference/en/mysqld-version-reference-reservedwords-5-0.html
Keywords=\
  ADD,ALL,ALTER,ANALYZE,AND,AS,ASC,ASENSITIVE,\
  BEFORE,BETWEEN,BIGINT,BINARY,BLOB,BOTH,BY,\
  CALL,CASCADE,CASE,CHANGE,CHAR,CHARACTER,CHECK,COLLATE,COLUMN,COLUMNS,\
  CONDITION,CONNECTION,CONSTRAINT,CONTINUE,CONVERT,CREATE,CROSS,\
  CURRENT_DATE,CURRENT_TIME,CURRENT_TIMESTAMP,CURRENT_USER,CURSOR,\
  DATABASE,DATABASES,DAY_HOUR,DAY_MICROSECOND,DAY_MINUTE,DAY_SECOND,\
  DEC,DECIMAL,DECLARE,DEFAULT,DELAYED,DELETE,DESC,DESCRIBE,\
  DETERMINISTIC,DISTINCT,DISTINCTROW,DIV,DOUBLE,DROP,DUAL,\
  EACH,ELSE,ELSEIF,ENCLOSED,ESCAPED,EXISTS,EXIT,EXPLAIN,\
  FALSE,FETCH,FIELDS,FLOAT,FLOAT4,FLOAT8,FOR,FORCE,FOREIGN,FROM,FULLTEXT,\
  GOTO,GRANT,GROUP,\
  HAVING,HIGH_PRIORITY,HOUR_MICROSECOND,HOUR_MINUTE,HOUR_SECOND,\
  IF,IGNORE,IN,INDEX,INFILE,INNER,INOUT,INSENSITIVE,INSERT,\
  INT,INT1,INT2,INT3,INT4,INT8,INTEGER,INTERVAL,INTO,IS,ITERATE,\
  JOIN,\
  KEY,KEYS,KILL,\
  LABEL,LEADING,LEAVE,LEFT,LIKE,LIMIT,LINES,LOAD,LOCALTIME,LOCALTIMESTAMP,\
  LOCK,LONG,LONGBLOB,LONGTEXT,LOOP,LOW_PRIORITY,\
  MATCH,MEDIUMBLOB,MEDIUMINT,MEDIUMTEXT,MIDDLEINT,\
  MINUTE_MICROSECOND,MINUTE_SECOND,MOD,MODIFIES,\
  NATURAL,NOT,NO_WRITE_TO_BINLOG,NULL,NUMERIC,\
  ON,OPTIMIZE,OPTION,OPTIONALLY,OR,ORDER,OUT,OUTER,OUTFILE,\
  PRECISION,PRIMARY,PRIVILEGES,PROCEDURE,PURGE,\
  READ,READS,REAL,REFERENCES,REGEXP,RELEASE,RENAME,REPEAT,REPLACE,\
  REQUIRE,RESTRICT,RETURN,REVOKE,RIGHT,RLIKE,\
  SCHEMA,SCHEMAS,SECOND_MICROSECOND,SELECT,SENSITIVE,SEPARATOR,SET,SHOW,\
  SMALLINT,SONAME,SPATIAL,SPECIFIC,SQL,SQLEXCEPTION,SQLSTATE,SQLWARNING,\
  SQL_BIG_RESULT,SQL_CALC_FOUND_ROWS,SQL_SMALL_RESULT,SSL,STARTING,STRAIGHT_JOIN,\
  TABLE,TABLES,TERMINATED,THEN,TINYBLOB,TINYINT,TINYTEXT,TO,TRAILING,TRIGGER,TRUE,\
  UNDO,UNION,UNIQUE,UNLOCK,UNSIGNED,UPDATE,UPGRADE,USAGE,USE,USING,\
  UTC_DATE,UTC_TIME,UTC_TIMESTAMP,\
  VALUES,VARBINARY,VARCHAR,VARCHARACTER,VARYING,\
  WHEN,WHERE,WHILE,WITH,WRITE,\
  XOR,\
  YEAR_MONTH,\
  ZEROFILL

# The character to append to attribute names to avoid exceptions due to
# clashes between keywords and attribute names
KeywordsMaskChar=_

#flags for loading and saving instances using DatabaseLoader/Saver
nominalToStringLimit=50
idColumn=auto_generated_id
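DatabaseUtils.props uses the standard Java properties syntax (key=value, with # starting a comment), so an edited file can be sanity-checked with java.util.Properties. The sketch below parses a few lines copied from the file above and prints the values that would be picked up. PropsPeek and parse are hypothetical names for illustration; this is only a way to inspect the settings, not a description of how Weka reads them internally.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Sketch: parse properties-format text and print a few database settings.
// PropsPeek and parse are hypothetical names, not part of Weka.
class PropsPeek {
    /** Parses text written in Java properties syntax. */
    static Properties parse(String text) {
        Properties p = new Properties();
        try {
            p.load(new StringReader(text));
        } catch (IOException e) {
            // Reading from an in-memory string should not fail in practice.
            throw new UncheckedIOException(e);
        }
        return p;
    }

    public static void main(String[] args) {
        // A few lines copied from the props file; '#' lines are ignored.
        String sample = String.join("\n",
            "#jdbcDriver=org.gjt.mm.mysql.Driver",
            "jdbcDriver=com.mysql.jdbc.Driver",
            "jdbcURL=jdbc:mysql://localhost:3306/rtest",
            "INT_UNSIGNED=6",
            "VARCHAR=0");
        Properties p = parse(sample);
        System.out.println("driver: " + p.getProperty("jdbcDriver"));
        System.out.println("url:    " + p.getProperty("jdbcURL"));
        System.out.println("INT_UNSIGNED -> " + p.getProperty("INT_UNSIGNED"));
    }
}
```

To check a real file instead of the sample string, the same Properties.load call also accepts a FileReader.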
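Start Weka, click Open DB, then click User and enter the username and password. I used root (the default account from the MySQL installation, which I never changed) and mysql (the password I set when installing MySQL). Click Connect. If the Info box shows connecting to: jdbc:mysql://localhost:3306/myweka = true, the connection succeeded (the database name in the URL is whichever one jdbcURL points to).
Once connected, type an SQL query into the "Query" field, for example: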
select * from user;
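Then click the "Execute" button on the right. (If the database connection failed, Execute will not work.)
The "Result" field shows the outcome of the query. When you are happy with it, click the "OK" button at the bottom, and the Explorer loads the dataset from the database.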
********************************************
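When I did this myself, I ran into a problem where the INT UNSIGNED data type was not defined. The fix:
If clicking OK produces an error about an unknown dataType, Weka does not recognise one of the column types. Add the corresponding type to the props file above (for example, the INT_UNSIGNED=6 line).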
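The reason the added line is spelled INT_UNSIGNED rather than INT UNSIGNED can be sketched as follows. This assumes Weka looks up the column type name reported by the JDBC driver as a key in the props file and also tries the name with blanks replaced by underscores; treat that as an assumption about Weka's behaviour, not a copy of its code. TypeLookup and lookup are hypothetical names.

```java
import java.util.Properties;

// Sketch of a type-code lookup with an underscore fallback.
// Assumption: this mirrors how the props file is consulted; it is not Weka's code.
class TypeLookup {
    /** Returns the type code mapped to dbType, or -1 if no mapping exists. */
    static int lookup(Properties props, String dbType) {
        String value = props.getProperty(dbType);
        if (value == null) {
            // Try "INT UNSIGNED" as "INT_UNSIGNED".
            value = props.getProperty(dbType.replace(' ', '_'));
        }
        if (value == null) {
            return -1;
        }
        try {
            return Integer.parseInt(value.trim());
        } catch (NumberFormatException e) {
            return -1;
        }
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("INT", "5");
        System.out.println(lookup(props, "INT UNSIGNED")); // -1: no mapping yet
        props.setProperty("INT_UNSIGNED", "6");
        System.out.println(lookup(props, "INT UNSIGNED")); // 6: found via underscore form
    }
}
```

Any other unrecognised type can be handled the same way: add a line with the type name (blanks as underscores) and one of the codes 0 to 10 listed in the comment block of the props file.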
**************************************************************
The steps for other databases are essentially the same, so I won't cover them here.
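Whatever the database, the sequence is the same: connect with a JDBC URL, run a query, read the rows. A minimal sketch of that sequence in plain JDBC (not Weka's own classes) is shown below. QueryCheck and countRows are hypothetical names, and the URL, user, password and table in main are the ones used in this post; change them for your own setup or for a different database.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Sketch: connect, run one query and count the rows using plain JDBC.
// QueryCheck and countRows are hypothetical names, not part of Weka.
class QueryCheck {
    /** Runs the query and returns the number of rows, or -1 if anything fails. */
    static int countRows(String url, String user, String password, String sql) {
        try (Connection con = DriverManager.getConnection(url, user, password);
             Statement st = con.createStatement();
             ResultSet rs = st.executeQuery(sql)) {
            int rows = 0;
            while (rs.next()) {
                rows++;
            }
            return rows;
        } catch (SQLException e) {
            // Covers a missing driver, a refused connection and a bad query.
            System.err.println("Query failed: " + e.getMessage());
            return -1;
        }
    }

    public static void main(String[] args) {
        // Values taken from this post; adjust for your own database.
        int rows = countRows("jdbc:mysql://localhost:3306/myweka",
                             "root", "mysql", "select * from user");
        System.out.println(rows < 0 ? "could not load data" : rows + " rows");
    }
}
```

For another database, only the driver jar on the ClassPath and the JDBC URL should need to change.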