Pandas Basics

Loading data

Function: pandas.read_csv

>>> import pandas
>>> food_info = pandas.read_csv("food_info.csv")
>>> print(food_info.dtypes)
NDB_No               int64
Shrt_Desc           object
Water_(g)          float64
Energ_Kcal           int64
Protein_(g)        float64
Lipid_Tot_(g)      float64
Ash_(g)            float64
Carbohydrt_(g)     float64
Fiber_TD_(g)       float64
Sugar_Tot_(g)      float64
Calcium_(mg)       float64
Iron_(mg)          float64
Magnesium_(mg)     float64
Phosphorus_(mg)    float64
Potassium_(mg)     float64
Sodium_(mg)        float64
Zinc_(mg)          float64
Copper_(mg)        float64
Manganese_(mg)     float64
Selenium_(mcg)     float64
Vit_C_(mg)         float64
Thiamin_(mg)       float64
Riboflavin_(mg)    float64
Niacin_(mg)        float64
Vit_B6_(mg)        float64
Vit_B12_(mcg)      float64
Vit_A_IU           float64
Vit_A_RAE          float64
Vit_E_(mg)         float64
Vit_D_mcg          float64
Vit_D_IU           float64
Vit_K_(mcg)        float64
FA_Sat_(g)         float64
FA_Mono_(g)        float64
FA_Poly_(g)        float64
Cholestrl_(mg)     float64
dtype: object
>>> print(type(food_info))
<class 'pandas.core.frame.DataFrame'>
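To try the same calls without food_info.csv at hand, a minimal sketch can feed read_csv an in-memory file. The three-row sample below is an assumption made for illustration: it copies a few columns and values from the output shown in this post, not the full dataset.

```python
import io
import pandas

# Hypothetical three-row sample that mimics a few columns of food_info.csv.
csv_text = """NDB_No,Shrt_Desc,Water_(g),Energ_Kcal
1001,BUTTER WITH SALT,15.87,717
1002,BUTTER WHIPPED WITH SALT,15.87,717
1003,BUTTER OIL ANHYDROUS,0.24,876
"""

# read_csv accepts a file path or any file-like object, so StringIO works here.
sample = pandas.read_csv(io.StringIO(csv_text))

# Column types are inferred from the values: whole numbers become int64,
# decimals become float64, and text columns get a string-like dtype.
print(sample.dtypes)
print(type(sample))
```

Because the types are inferred per column, a single non-numeric value in a numeric column would typically change that column's dtype.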

Getting the head and tail of the data

Head: head

food_info.head(1)

Tail: tail

food_info.tail(1)

shape

>>> food_info.shape
(8618, 36)
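As a small sketch of how head, tail, and shape behave, the example below uses a hypothetical five-row frame (the column names and values are made up for illustration):

```python
import pandas

# Hypothetical five-row frame; the values are invented for illustration.
df = pandas.DataFrame(
    {"id": [1, 2, 3, 4, 5], "value": [10.0, 20.0, 30.0, 40.0, 50.0]}
)

first = df.head(2)         # first two rows
last = df.tail(2)          # last two rows
default_head = df.head()   # with no argument, head() returns up to 5 rows

# shape is an attribute holding a (rows, columns) tuple, not a method.
rows, cols = df.shape

print(first)
print(last)
print(rows, cols)
```

Both head and tail return a new DataFrame, so they can be chained with other calls.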

Selecting data

Selecting a specific row

>>> print(food_info.loc[0])
NDB_No                         1001
Shrt_Desc          BUTTER WITH SALT
Water_(g)                     15.87
Energ_Kcal                      717
Protein_(g)                    0.85
Lipid_Tot_(g)                 81.11
Ash_(g)                        2.11
Carbohydrt_(g)                 0.06
Fiber_TD_(g)                      0
Sugar_Tot_(g)                  0.06
Calcium_(mg)                     24
Iron_(mg)                      0.02
Magnesium_(mg)                    2
Phosphorus_(mg)                  24
Potassium_(mg)                   24
Sodium_(mg)                     643
Zinc_(mg)                      0.09
Copper_(mg)                       0
Manganese_(mg)                    0
Selenium_(mcg)                    1
Vit_C_(mg)                        0
Thiamin_(mg)                  0.005
Riboflavin_(mg)               0.034
Niacin_(mg)                   0.042
Vit_B6_(mg)                   0.003
Vit_B12_(mcg)                  0.17
Vit_A_IU                       2499
Vit_A_RAE                       684
Vit_E_(mg)                     2.32
Vit_D_mcg                       1.5
Vit_D_IU                         60
Vit_K_(mcg)                       7
FA_Sat_(g)                   51.368
FA_Mono_(g)                  21.021
FA_Poly_(g)                   3.043
Cholestrl_(mg)                  215
Name: 0, dtype: object

Selecting a range of rows

>>> print(food_info.loc[1:2])
   NDB_No                 Shrt_Desc  Water_(g)  Energ_Kcal  Protein_(g)  
1    1002  BUTTER WHIPPED WITH SALT      15.87         717         0.85
2    1003      BUTTER OIL ANHYDROUS       0.24         876         0.28

   Lipid_Tot_(g)  Ash_(g)  Carbohydrt_(g)  Fiber_TD_(g)  Sugar_Tot_(g)  
1          81.11     2.11            0.06           0.0           0.06
2          99.48     0.00            0.00           0.0           0.00

        ...        Vit_A_IU  Vit_A_RAE  Vit_E_(mg)  Vit_D_mcg  Vit_D_IU  
1       ...          2499.0      684.0        2.32        1.5      60.0
2       ...          3069.0      840.0        2.80        1.8      73.0

   Vit_K_(mcg)  FA_Sat_(g)  FA_Mono_(g)  FA_Poly_(g)  Cholestrl_(mg)
1          7.0      50.489       23.426        3.012           219.0
2          8.6      61.924       28.732        3.694           256.0
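Note that loc[1:2] above returned two rows: loc slices by label and includes both endpoints, unlike ordinary Python slicing. The sketch below contrasts it with position-based iloc on a hypothetical four-row frame built from values shown in this post.

```python
import pandas

# Hypothetical four-row frame using names that appear in the output above.
df = pandas.DataFrame(
    {
        "NDB_No": [1001, 1002, 1003, 1004],
        "Shrt_Desc": [
            "BUTTER WITH SALT",
            "BUTTER WHIPPED WITH SALT",
            "BUTTER OIL ANHYDROUS",
            "CHEESE BLUE",
        ],
    }
)

row = df.loc[0]              # a single label returns a Series
by_label = df.loc[1:2]       # label slice: both ends included, 2 rows
by_position = df.iloc[1:2]   # position slice: end excluded, 1 row
picked = df.loc[[0, 3]]      # a list of labels returns those rows

print(by_label)
print(by_position)
```

With the default integer index, labels and positions coincide, so the difference only shows up in the slice endpoints or after the rows have been reordered.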

Selecting a column

>>> print(food_info["NDB_No"])
0        1001
1        1002
2        1003
3        1004
4        1005
5        1006
6        1007
7        1008
8        1009
9        1010
10       1011
11       1012
12       1013
13       1014
14       1015
15       1016
16       1017
17       1018
18       1019
19       1020
20       1021
21       1022
22       1023
23       1024
24       1025
25       1026
26       1027
27       1028
28       1029
29       1030
        ...
8588    43544
8589    43546
8590    43550
8591    43566
8592    43570
8593    43572
8594    43585
8595    43589
8596    43595
8597    43597
8598    43598
8599    44005
8600    44018
8601    44048
8602    44055
8603    44061
8604    44074
8605    44110
8606    44158
8607    44203
8608    44258
8609    44259
8610    44260
8611    48052
8612    80200
8613    83110
8614    90240
8615    90480
8616    90560
8617    93600
Name: NDB_No, Length: 8618, dtype: int64

Selecting multiple columns

>>> print(food_info[["NDB_No","Shrt_Desc"]])
      NDB_No                                          Shrt_Desc
0       1001                                   BUTTER WITH SALT
1       1002                           BUTTER WHIPPED WITH SALT
2       1003                               BUTTER OIL ANHYDROUS
3       1004                                        CHEESE BLUE
4       1005                                       CHEESE BRICK
5       1006                                        CHEESE BRIE
6       1007                                   CHEESE CAMEMBERT
7       1008                                     CHEESE CARAWAY
8       1009                                     CHEESE CHEDDAR
9       1010                                    CHEESE CHESHIRE
10      1011                                       CHEESE COLBY
11      1012                CHEESE COTTAGE CRMD LRG OR SML CURD
12      1013                        CHEESE COTTAGE CRMD W/FRUIT
13      1014   CHEESE COTTAGE NONFAT UNCRMD DRY LRG OR SML CURD
14      1015                   CHEESE COTTAGE LOWFAT 2% MILKFAT
15      1016                   CHEESE COTTAGE LOWFAT 1% MILKFAT
16      1017                                       CHEESE CREAM
17      1018                                        CHEESE EDAM
18      1019                                        CHEESE FETA
19      1020                                     CHEESE FONTINA
20      1021                                     CHEESE GJETOST
21      1022                                       CHEESE GOUDA
22      1023                                     CHEESE GRUYERE
23      1024                                   CHEESE LIMBURGER
24      1025                                    CHEESE MONTEREY
25      1026                         CHEESE MOZZARELLA WHL MILK
26      1027                CHEESE MOZZARELLA WHL MILK LO MOIST
27      1028                   CHEESE MOZZARELLA PART SKIM MILK
28      1029               CHEESE MOZZARELLA LO MOIST PART-SKIM
29      1030                                    CHEESE MUENSTER
...      ...                                                ...
8588   43544         BABYFOOD CRL RICE W/ PEARS & APPL DRY INST
8589   43546                     BABYFOOD BANANA NO TAPIOCA STR
8590   43550                     BABYFOOD BANANA APPL DSSRT STR
8591   43566       SNACKS TORTILLA CHIPS LT (BAKED W/ LESS OIL)
8592   43570  CEREALS RTE POST HONEY BUNCHES OF OATS HONEY RSTD
8593   43572                         POPCORN MICROWAVE LOFAT&NA
8594   43585                       BABYFOOD FRUIT SUPREME DSSRT
8595   43589                               CHEESE SWISS LOW FAT
8596   43595             BREAKFAST BAR CORN FLAKE CRUST W/FRUIT
8597   43597                            CHEESE MOZZARELLA LO NA
8598   43598                           MAYONNAISE DRSNG NO CHOL
8599   44005                          OIL CORN PEANUT AND OLIVE
8600   44018                   SWEETENERS TABLETOP FRUCTOSE LIQ
8601   44048                              CHEESE FOOD IMITATION
8602   44055                                CELERY FLAKES DRIED
8603   44061           PUDDINGS CHOC FLAVOR LO CAL INST DRY MIX
8604   44074                    BABYFOOD GRAPE JUC NO SUGAR CND
8605   44110                   JELLIES RED SUGAR HOME PRESERVED
8606   44158                         PIE FILLINGS BLUEBERRY CND
8607   44203               COCKTAIL MIX NON-ALCOHOLIC CONCD FRZ
8608   44258            PUDDINGS CHOC FLAVOR LO CAL REG DRY MIX
8609   44259  PUDDINGS ALL FLAVORS XCPT CHOC LO CAL REG DRY MIX
8610   44260  PUDDINGS ALL FLAVORS XCPT CHOC LO CAL INST DRY...
8611   48052                                 VITAL WHEAT GLUTEN
8612   80200                                      FROG LEGS RAW
8613   83110                                    MACKEREL SALTED
8614   90240                         SCALLOP (BAY&SEA) CKD STMD
8615   90480                                         SYRUP CANE
8616   90560                                          SNAIL RAW
8617   93600                                   TURTLE GREEN RAW

[8618 rows x 2 columns]
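The two forms of indexing above return different types: a single column name gives a Series, while a list of names gives a DataFrame. A minimal sketch on a hypothetical two-row frame:

```python
import pandas

# Hypothetical two-row frame using names that appear in the output above.
df = pandas.DataFrame(
    {
        "NDB_No": [1001, 1002],
        "Shrt_Desc": ["BUTTER WITH SALT", "BUTTER WHIPPED WITH SALT"],
        "Energ_Kcal": [717, 717],
    }
)

one = df["NDB_No"]                  # single name: Series
one_frame = df[["NDB_No"]]          # list with one name: one-column DataFrame
two = df[["NDB_No", "Shrt_Desc"]]   # list of names: DataFrame, in list order

print(type(one))
print(type(one_frame))
print(two.shape)
```

This is why the multi-column call needs double brackets: the outer pair is the indexing operator and the inner pair is the list of names.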

Getting all column names

>>> food_info.columns.tolist()
['NDB_No', 'Shrt_Desc', 'Water_(g)', 'Energ_Kcal', 'Protein_(g)', 'Lipid_Tot_(g)', 'Ash_(g)', 'Carbohydrt_(g)', 'Fiber_TD_(g)', 'Sugar_Tot_(g)', 'Calcium_(mg)', 'Iron_(mg)', 'Magnesium_(mg)', 'Phosphorus_(mg)', 'Potassium_(mg)', 'Sodium_(mg)', 'Zinc_(mg)', 'Copper_(mg)', 'Manganese_(mg)', 'Selenium_(mcg)', 'Vit_C_(mg)', 'Thiamin_(mg)', 'Riboflavin_(mg)', 'Niacin_(mg)', 'Vit_B6_(mg)', 'Vit_B12_(mcg)', 'Vit_A_IU', 'Vit_A_RAE', 'Vit_E_(mg)', 'Vit_D_mcg', 'Vit_D_IU', 'Vit_K_(mcg)', 'FA_Sat_(g)', 'FA_Mono_(g)', 'FA_Poly_(g)', 'Cholestrl_(mg)']
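Since tolist() yields a plain Python list, the names can be filtered with ordinary list operations and then passed back to the indexing operator. The sketch below, on a hypothetical empty frame with a handful of the column names above, keeps only the columns measured in grams:

```python
import pandas

# Hypothetical frame with a subset of the column names listed above.
df = pandas.DataFrame(
    columns=[
        "NDB_No",
        "Shrt_Desc",
        "Water_(g)",
        "Energ_Kcal",
        "Protein_(g)",
        "Calcium_(mg)",
    ]
)

names = df.columns.tolist()

# Keep only names ending in "(g)"; "(mg)" names do not match this suffix.
gram_columns = [name for name in names if name.endswith("(g)")]

# A list of names selects those columns as a new DataFrame.
gram_data = df[gram_columns]

print(gram_columns)
```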

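Sorting

Ascending

inplace=True sorts the current object in place; to get a new sorted object back and leave the original unchanged, set it to False (the default).

>>> food_info.sort_values("Water_(g)", inplace=True)
>>> food_info["Water_(g)"]
760       0.00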
8599      0.00
654       0.00
631       0.00
630       0.00
629       0.00
611       0.00
610       0.00
655       0.00
673       0.00
663       0.00
671       0.00
670       0.00
669       0.00
633       0.00
668       0.00
700       0.00
665       0.00
664       0.00
662       0.00
656       0.00
661       0.00
660       0.00
659       0.00
658       0.00
657       0.00
699       0.00
737       0.00
8122      0.00
667       0.00
         ...
4270     99.80
4411     99.85
4408     99.89
4357     99.90
4239     99.90
4356     99.90
4369     99.90
4347     99.90
4205     99.90
4203     99.93
4204     99.95
4208     99.95
4213     99.95
4374     99.96
4407     99.97
4379     99.97
4373     99.97
4404     99.98
4372     99.98
4377    100.00
4378    100.00
4348    100.00
4209    100.00
4376    100.00
6150       NaN
6067       NaN
6113       NaN
1983       NaN
7776       NaN
6095       NaN
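The difference between the two inplace settings can be sketched on a hypothetical three-row frame (the values are taken from the output above, the frame itself is made up):

```python
import pandas

# Hypothetical three-row frame for illustration.
df = pandas.DataFrame({"Water_(g)": [15.87, 0.24, 99.9]})

# Default (inplace=False): a new sorted DataFrame is returned.
sorted_copy = df.sort_values("Water_(g)")
order_before = df.index.tolist()   # original frame is still unsorted

# inplace=True: df itself is reordered and the call returns None.
result = df.sort_values("Water_(g)", inplace=True)
order_after = df.index.tolist()

print(sorted_copy)
print(order_before, order_after, result)
```

Because the in-place call returns None, writing `df = df.sort_values(..., inplace=True)` would discard the data; use one style or the other.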

Descending

>>> food_info.sort_values("Water_(g)",inplace = True , ascending = False)
>>> food_info["Water_(g)"]
4376    100.00
4209    100.00
4348    100.00
4378    100.00
4377    100.00
4372     99.98
4404     99.98
4407     99.97
4379     99.97
4373     99.97
4374     99.96
4213     99.95
4208     99.95
4204     99.95
4203     99.93
4356     99.90
4357     99.90
4239     99.90
4205     99.90
4369     99.90
4347     99.90
4408     99.89
4411     99.85
4270     99.80
4252     99.80
4392     99.80
4260     99.80
4409     99.79
4255     99.74
4397     99.70
         ...
739       0.00
790       0.00
638       0.00
689       0.00
688       0.00
687       0.00
686       0.00
685       0.00
666       0.00
632       0.00
653       0.00
639       0.00
696       0.00
8455      0.00
791       0.00
675       0.00
8180      0.00
704       0.00
705       0.00
706       0.00
707       0.00
738       0.00
6417      0.00
760       0.00
6150       NaN
6067       NaN
6113       NaN
1983       NaN
7776       NaN
6095       NaN
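
In both outputs above the NaN rows come last, because sort_values places missing values at the end by default regardless of direction. The sketch below, on a hypothetical four-row frame, shows the na_position parameter and how to renumber the index after sorting:

```python
import pandas

# Hypothetical four-row frame with one missing value.
df = pandas.DataFrame({"Water_(g)": [15.87, float("nan"), 100.0, 0.0]})

# Descending sort; NaN still goes last (na_position defaults to "last").
desc = df.sort_values("Water_(g)", ascending=False)

# na_position="first" moves missing values to the top instead.
nan_first = df.sort_values("Water_(g)", ascending=False, na_position="first")

# Sorting keeps the original index labels; reset_index renumbers from 0.
renumbered = desc.reset_index(drop=True)

print(desc)
print(nan_first)
print(renumbered)
```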

Original article: https://www.cnblogs.com/zfcode/p/Pandas-ji-chu-xue-xi.html