Reading Files in Python

Read the file Advertising.csv, whose contents look like this:

,TV,Radio,Newspaper,Sales
1,230.1,37.8,69.2,22.1
2,44.5,39.3,45.1,10.4
3,17.2,45.9,69.3,9.3
4,151.5,41.3,58.5,18.5
5,180.8,10.8,58.4,12.9
6,8.7,48.9,75,7.2
7,57.5,32.8,23.5,11.8
8,120.2,19.6,11.6,13.2
9,8.6,2.1,1,4.8
10,199.8,2.6,21.2,10.6
11,66.1,5.8,24.2,8.6
12,214.7,24,4,17.4
13,23.8,35.1,65.9,9.2
14,97.5,7.6,7.2,9.7
15,204.1,32.9,46,19
16,195.4,47.7,52.9,22.4
17,67.8,36.6,114,12.5
18,281.4,39.6,55.8,24.4
19,69.2,20.5,18.3,11.3
20,147.3,23.9,19.1,14.6


Reading manually:

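path = '8.Advertising.csv'
x = []
y = []
with open(path) as f:
    for i, line in enumerate(f):
        if i == 0:  # the first line is the header row
            continue
        line = line.strip()  # strip leading and trailing whitespace
        if not line:  # skip blank lines
            continue
        d = [float(v) for v in line.split(',')]  # convert every field to float
        x.append(d[1:-1])  # feature columns (the leading index column is dropped)
        y.append(d[-1])  # last column: Sales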

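The same loop can be wrapped in a function that accepts any iterable of lines, which makes it easy to try on a small in-memory sample. This is only a sketch: the function name `read_rows` and the two-row sample are illustrative assumptions, not part of the original script.

```python
import io


def read_rows(lines):
    """Split numeric CSV lines into features and targets, skipping the header."""
    x, y = [], []
    for i, line in enumerate(lines):
        if i == 0:  # header row
            continue
        line = line.strip()
        if not line:  # skip blank lines
            continue
        values = [float(v) for v in line.split(',')]
        x.append(values[1:-1])  # drop the leading index column
        y.append(values[-1])    # last column is the target
    return x, y


# Hypothetical in-memory sample with the same layout as Advertising.csv
sample = io.StringIO(
    ",TV,Radio,Newspaper,Sales\n"
    "1,230.1,37.8,69.2,22.1\n"
    "2,44.5,39.3,45.1,10.4\n"
    "\n"
)
x, y = read_rows(sample)
print(x)
print(y)
```

Because an open file object is also an iterable of lines, `read_rows(open(path))` should behave the same way on the real file.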

Using Python's built-in csv module:

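import csv

# In Python 3 the csv module expects a text-mode file opened with newline=''
with open(path, newline='') as f:
    print(f)  # the file object itself
    d = csv.reader(f)
    for line in d:
        print(line)  # each row is a list of strings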

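When the header row matters, `csv.DictReader` from the same module maps each row to a dictionary keyed by column name. The following is a minimal sketch on a hypothetical two-row sample. Note that the first header cell is empty, so the index column ends up under the key `''`.

```python
import csv
import io

# Hypothetical in-memory sample with the same layout as Advertising.csv
sample = io.StringIO(
    ",TV,Radio,Newspaper,Sales\n"
    "1,230.1,37.8,69.2,22.1\n"
    "2,44.5,39.3,45.1,10.4\n"
)

reader = csv.DictReader(sample)  # the first row supplies the field names
rows = list(reader)

# Values are still strings and must be converted explicitly
tv = [float(row['TV']) for row in rows]
sales = [float(row['Sales']) for row in rows]
print(tv, sales)
```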

numpy:

import numpy as np

p = np.loadtxt(path, delimiter=',', skiprows=1)  # skip the header row
print(p)

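`np.loadtxt` returns a single 2-D array, so features and target can then be separated by slicing. A short sketch, assuming the same column layout as above (index, three features, Sales) and using a hypothetical in-memory sample in place of the real file:

```python
import io

import numpy as np

# Hypothetical in-memory sample with the same layout as Advertising.csv
sample = io.StringIO(
    ",TV,Radio,Newspaper,Sales\n"
    "1,230.1,37.8,69.2,22.1\n"
    "2,44.5,39.3,45.1,10.4\n"
)

p = np.loadtxt(sample, delimiter=',', skiprows=1)
x = p[:, 1:-1]  # TV, Radio, Newspaper (drop the index column)
y = p[:, -1]    # Sales
print(x.shape, y.shape)
```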

pandas:

import pandas as pd

data = pd.read_csv(path)  # columns: TV, Radio, Newspaper, Sales
x = data[['TV', 'Radio', 'Newspaper']]
# x = data[['TV', 'Radio']]
y = data['Sales']

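Since the first header cell of the file is empty, `pd.read_csv` will by default load that column under the name `Unnamed: 0`. One option, sketched here on a hypothetical in-memory sample, is to pass `index_col=0` so that the column becomes the row index instead:

```python
import io

import pandas as pd

# Hypothetical in-memory sample with the same layout as Advertising.csv
sample = io.StringIO(
    ",TV,Radio,Newspaper,Sales\n"
    "1,230.1,37.8,69.2,22.1\n"
    "2,44.5,39.3,45.1,10.4\n"
)

data = pd.read_csv(sample, index_col=0)  # use the unnamed first column as the index
x = data[['TV', 'Radio', 'Newspaper']]
y = data['Sales']
print(data.columns.tolist())
```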

Preprocessing with sklearn:

from sklearn.preprocessing import LabelEncoder

le = LabelEncoder()
le.fit(['Iris-setosa', 'Iris-versicolor', 'Iris-virginica'])
print(le.classes_)
y = le.transform(y)  # y is assumed to hold string labels from the classes above

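To make the effect of `LabelEncoder` concrete, here is a small self-contained sketch. The `labels` list is a made-up example, not data from the original post. Classes are stored in sorted order, so each label maps to its position in `le.classes_`, and `inverse_transform` recovers the original strings.

```python
from sklearn.preprocessing import LabelEncoder

# Hypothetical label column for illustration
labels = ['Iris-virginica', 'Iris-setosa', 'Iris-versicolor', 'Iris-setosa']

le = LabelEncoder()
le.fit(['Iris-setosa', 'Iris-versicolor', 'Iris-virginica'])

encoded = le.transform(labels)           # strings to integer codes
decoded = le.inverse_transform(encoded)  # integer codes back to strings
print(encoded)
print(list(decoded))
```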