Basic usage of Series and DataFrame in the Python pandas package

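pandas is a Python extension package for data processing.
  1. Series: like a list, but every element also carries an index label

  1.2 A list can be converted to a Series, as shown below: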

import pandas as pd
my_list=[1,'two','three','l4','z5','v6']
s=pd.Series(my_list)
print(s)

Output:
0        1
1      two
2    three
3       l4
4       z5
5       v6
dtype: object

  1.3 When creating a Series, you can also supply your own index values:

s1=pd.Series([1,'two','three','l4','z5','v6'],
index=['A','B','C','D','E','F'])
print(s1)

Result:
A        1
B      two
C    three
D       l4
E       z5
F       v6
dtype: object
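
As a small illustration of what the index adds over a plain list, the sketch below reads one element of s1 by its label and by its integer position. The variable names by_label, by_position and labels are chosen here for illustration only.

```python
import pandas as pd

# Same data and index labels as s1 in section 1.3.
s1 = pd.Series([1, 'two', 'three', 'l4', 'z5', 'v6'],
               index=['A', 'B', 'C', 'D', 'E', 'F'])

by_label = s1['C']          # look up by index label
by_position = s1.iloc[2]    # look up by integer position
labels = list(s1.index)     # the index is an object of its own

print(by_label, by_position, labels)
```

Both lookups return the same element; only the way of addressing it differs.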

  1.4 Creating a Series from a dictionary:

import pandas as pd
cities={'Beijing':55000,'Shanghai':60000,'shenzhen':50000,'Hangzhou':20000,'Guangzhou':45000,'Suzhou':None}
apts=pd.Series(cities,name='income')
print(apts)

Result (the None value is stored as NaN, which is why the dtype is float64):
Beijing      55000.0
Shanghai     60000.0
shenzhen     50000.0
Hangzhou     20000.0
Guangzhou    45000.0
Suzhou           NaN
Name: income, dtype: float64
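
The missing entry for Suzhou can be detected or dropped with the standard isna and dropna methods. The following is a minimal sketch that reuses the cities dictionary from above; the names missing and known are illustrative.

```python
import pandas as pd

cities = {'Beijing': 55000, 'Shanghai': 60000, 'shenzhen': 50000,
          'Hangzhou': 20000, 'Guangzhou': 45000, 'Suzhou': None}
apts = pd.Series(cities, name='income')

missing = apts.isna()        # boolean Series, True where the value is missing
known = apts.dropna()        # copy of apts without the missing entries

print(apts.name)             # the name passed to the constructor
print(missing['Suzhou'])
print(len(known))
```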

  1.5. A Series can be handled much like a list, including all the usual slicing operations; other operations work similarly.
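
A short sketch of slicing, assuming the s1 Series from section 1.3. One difference from lists is worth noting: a slice by position (iloc) excludes its end, while a slice by label (loc) includes the end label.

```python
import pandas as pd

s1 = pd.Series([1, 'two', 'three', 'l4', 'z5', 'v6'],
               index=['A', 'B', 'C', 'D', 'E', 'F'])

first_two = s1.iloc[:2]        # positions 0 and 1, end excluded as in a list
by_labels = s1.loc['B':'D']    # labels B, C and D, end label included
every_other = s1.iloc[::2]     # step slicing works as well

print(first_two)
print(by_labels)
print(every_other)
```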

2. DataFrame:

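A DataFrame is a data structure similar to a database table. It has both a row index and a column index, and it can be thought of as a dict of Series that share the same index. Internally, the data is stored in two-dimensional and one-dimensional blocks.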
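
To make the "dict of Series" picture concrete, here is a minimal sketch that builds a DataFrame from two Series with the same index and then pulls one column back out. The rank values and all variable names are made up for illustration.

```python
import pandas as pd

# Two Series with the same index labels (illustrative values).
income = pd.Series({'Beijing': 55000, 'Shanghai': 60000}, name='income')
rank = pd.Series({'Beijing': 2, 'Shanghai': 1}, name='rank')

# Each dict key becomes a column; the shared index becomes the row index.
df = pd.DataFrame({'income': income, 'rank': rank})

col = df['income']             # selecting a column gives back a Series
print(type(col).__name__)
print(list(df.index), list(df.columns))
```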

  2.1 Creating a DataFrame directly, as in the following code:

  

from pandas import DataFrame

df = DataFrame([
['a','b','c','d'],
[1,2,3,4]
])

df2 = DataFrame(df,index=['one','two'],columns=['aa','bb','cc','dd'])

# index holds the row labels, columns holds the column labels
print(df2)
print(df2.index)
print(df2.columns)

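Result (every value is NaN because df is already a DataFrame: pandas reindexes it against the new labels, and none of them match df's default labels 0, 1 and 0 to 3):

     aa  bb  cc  dd
one NaN NaN NaN NaN
two NaN NaN NaN NaN
Index(['one', 'two'], dtype='object')
Index(['aa', 'bb', 'cc', 'dd'], dtype='object')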
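
If the goal is to keep the values while attaching the labels, one option is to pass index and columns together with the raw data rather than with an existing DataFrame, which pandas reindexes against the new labels. Another is to assign the labels afterwards. The sketch below shows both; the variable name data is introduced here for illustration.

```python
import pandas as pd

data = [['a', 'b', 'c', 'd'],
        [1, 2, 3, 4]]

# Option 1: give the labels together with the raw data.
df2 = pd.DataFrame(data, index=['one', 'two'],
                   columns=['aa', 'bb', 'cc', 'dd'])

# Option 2: build with default labels, then overwrite them.
df = pd.DataFrame(data)
df.index = ['one', 'two']
df.columns = ['aa', 'bb', 'cc', 'dd']

print(df2.loc['two', 'cc'])    # value at row 'two', column 'cc'
print(df2.equals(df))          # both options give the same frame
```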

  2.2. Creating a DataFrame from a dictionary

  

from pandas import DataFrame
dict1 = dict(aprt=['101', '102', '103'], profits=[1000, 2000, 3000], year=[2001, 2002, 2003], month=8)
df3 = DataFrame(dict1)
df3.index=['one','two','three']
print(df3)
## The dictionary keys become the DataFrame's column labels, and the values become the column data

Result:
      aprt  profits  year  month
one    101     1000  2001      8
two    102     2000  2002      8
three  103     3000  2003      8
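
Note that month was given as the single value 8 and appears in every row. The following sketch, which assumes the same dict1 and passes the row labels directly to the constructor, checks that and reads a column and a row back out. The names months, row and total are illustrative.

```python
import pandas as pd

dict1 = dict(aprt=['101', '102', '103'], profits=[1000, 2000, 3000],
             year=[2001, 2002, 2003], month=8)
df3 = pd.DataFrame(dict1, index=['one', 'two', 'three'])

months = df3['month'].tolist()    # the scalar 8 is repeated for every row
row = df3.loc['two']              # one row, returned as a Series
total = df3['profits'].sum()      # column-wise aggregation

print(months, row['aprt'], total)
```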

  2.3. Reading a CSV file into a DataFrame with read_csv:

import pandas as pd
df4=pd.read_csv('data1.csv', sep=';',encoding='UTF-8',header=None)
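
Since data1.csv itself is not shown, the sketch below uses an in-memory string as a stand-in so that it can be run as is. The file contents and the column names are made up for illustration. The encoding argument is left out because the text is already decoded.

```python
import io
import pandas as pd

# Stand-in for the contents of data1.csv (made-up values, no header row).
text = "101;1000;2001\n102;2000;2002\n103;3000;2003\n"

# header=None tells pandas the first line is data; columns are numbered 0, 1, 2.
df4 = pd.read_csv(io.StringIO(text), sep=';', header=None)
default_columns = list(df4.columns)

# Optionally give the columns names afterwards (hypothetical names).
df4.columns = ['aprt', 'profits', 'year']
print(df4.shape)
```

Without header=None, the first line of the file would be used as the column names instead of being read as data.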
Original article: https://www.cnblogs.com/spencersun/p/9593767.html