A Ten-Minute Introduction to Pandas

Selecting Data

Select a single column. This returns a Series and is equivalent to df.A.

df['A']
"""
2019-01-01   -0.014355
2019-01-02   -0.888667
2019-01-03    0.663384
2019-01-04   -0.465610
2019-01-05   -0.404583
2019-01-06    0.898479
Freq: D, Name: A, dtype: float64
"""
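
The examples in this section rely on a DataFrame df indexed by a DatetimeIndex dates, neither of which is defined here. Below is a minimal sketch of such a setup, assuming six daily dates starting at 2019-01-01 and random values. The seed is an arbitrary choice, so the numbers will not match the output shown above.

```python
import numpy as np
import pandas as pd

# Assumed setup: six consecutive days starting 2019-01-01
dates = pd.date_range("2019-01-01", periods=6)
rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only
df = pd.DataFrame(rng.standard_normal((6, 4)), index=dates, columns=list("ABCD"))

# Bracket access and attribute access return the same Series
col = df["A"]
print(type(col).__name__)
print(col.equals(df.A))
```

Attribute access (df.A) only works when the column name is a valid Python identifier and does not clash with an existing DataFrame attribute, so df['A'] is the safer form in general.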

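Using [] with a slice selects rows. An integer slice excludes the end position, while a label slice includes both endpoints.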

df[0:3]
"""
                   A         B         C         D
2019-01-01 -0.014355 -0.734472 -1.919954 -0.031567
2019-01-02 -0.888667 -0.696474  0.480344  0.664976
2019-01-03  0.663384  1.443950  0.494649  0.452302
"""
df[dates[0]:dates[1]]
"""
                   A         B         C         D
2019-01-01 -0.014355 -0.734472 -1.919954 -0.031567
2019-01-02 -0.888667 -0.696474  0.480344  0.664976
"""
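
The difference between the two slicing styles is easy to miss, so here is a small sketch that puts them side by side. It assumes a df built from consecutive integers (a hypothetical stand-in for the random data used above), so the rows are easy to identify.

```python
import numpy as np
import pandas as pd

# Assumed setup: same shape and index as df above, but with integer values
dates = pd.date_range("2019-01-01", periods=6)
df = pd.DataFrame(np.arange(24).reshape(6, 4), index=dates, columns=list("ABCD"))

by_position = df[0:3]              # positions 0, 1, 2 (end excluded)
by_label = df[dates[0]:dates[2]]   # 2019-01-01 through 2019-01-03 (end included)

# Both slices select the same three rows
print(by_position.equals(by_label))
```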

Select by label with loc. When slicing by label, both endpoints are included.

df.loc[dates[0]]
"""
A   -0.014355
B   -0.734472
C   -1.919954
D   -0.031567
Name: 2019-01-01 00:00:00, dtype: float64
"""
df.loc[dates[1]:,['A', 'B']]
"""
                   A         B
2019-01-02 -0.888667 -0.696474
2019-01-03  0.663384  1.443950
2019-01-04 -0.465610 -1.417702
2019-01-05 -0.404583 -2.395802
2019-01-06  0.898479  0.869637
"""
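
As a sketch of the main loc access patterns, the example below selects a single row, a label-sliced block, and a scalar. It assumes an integer-valued df (hypothetical, chosen so the results are easy to check by eye).

```python
import numpy as np
import pandas as pd

# Assumed setup: integer values instead of random data
dates = pd.date_range("2019-01-01", periods=6)
df = pd.DataFrame(np.arange(24).reshape(6, 4), index=dates, columns=list("ABCD"))

row = df.loc[dates[0]]                           # one row, returned as a Series
block = df.loc[dates[1]:dates[3], ["A", "B"]]    # three rows: both endpoints included
value = df.loc[dates[0], "A"]                    # single scalar value

print(block.shape)
print(value)
```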

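Select by position with iloc. The arguments are integer positions, and slices exclude the end position, as in ordinary Python slicing.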

df.iloc[1:, 1:]
"""
                   B         C         D
2019-01-02 -0.696474  0.480344  0.664976
2019-01-03  1.443950  0.494649  0.452302
2019-01-04 -1.417702 -0.317638 -0.029420
2019-01-05 -2.395802  0.222277  0.334785
2019-01-06  0.869637  0.262307  2.740970
"""
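
A few more positional patterns, sketched on an integer-valued df (a hypothetical stand-in for the random data) so that each result can be checked directly:

```python
import numpy as np
import pandas as pd

# Assumed setup: integer values instead of random data
dates = pd.date_range("2019-01-01", periods=6)
df = pd.DataFrame(np.arange(24).reshape(6, 4), index=dates, columns=list("ABCD"))

tail = df.iloc[1:, 1:]             # drop the first row and the first column
row = df.iloc[3]                   # fourth row, returned as a Series
picked = df.iloc[[0, 2], [0, 3]]   # rows 0 and 2, columns A and D
value = df.iloc[1, 1]              # scalar at row 1, column 1

print(tail.shape)
print(picked)
```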

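Conditional selection. Indexing with a boolean DataFrame keeps the values where the condition holds and fills the rest with NaN, while indexing with a boolean Series keeps only the matching rows.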

df[df>0]
"""
                   A         B         C         D
2019-01-01       NaN       NaN       NaN       NaN
2019-01-02       NaN       NaN  0.480344  0.664976
2019-01-03  0.663384  1.443950  0.494649  0.452302
2019-01-04       NaN       NaN       NaN       NaN
2019-01-05       NaN       NaN  0.222277  0.334785
2019-01-06  0.898479  0.869637  0.262307  2.740970
"""
df[df.A>0]
"""
                   A         B         C         D
2019-01-03  0.663384  1.443950  0.494649  0.452302
2019-01-06  0.898479  0.869637  0.262307  2.740970
"""
# Work on a copy so that the original df is left unchanged
df_cp = df.copy()
# Add a string column to filter on
df_cp['E'] = ['One', 'One', 'Two', 'Three', 'Four', 'Four']
print(df_cp)
"""
                   A         B         C         D      E
2019-01-01 -0.014355 -0.734472 -1.919954 -0.031567    One
2019-01-02 -0.888667 -0.696474  0.480344  0.664976    One
2019-01-03  0.663384  1.443950  0.494649  0.452302    Two
2019-01-04 -0.465610 -1.417702 -0.317638 -0.029420  Three
2019-01-05 -0.404583 -2.395802  0.222277  0.334785   Four
2019-01-06  0.898479  0.869637  0.262307  2.740970   Four
"""
# Keep only the rows whose E value appears in the given list
df_cp[df_cp.E.isin(['One'])]
"""
                   A         B         C         D    E
2019-01-01 -0.014355 -0.734472 -1.919954 -0.031567  One
2019-01-02 -0.888667 -0.696474  0.480344  0.664976  One
"""
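
The conditions above can also be combined. The sketch below uses a small hand-written DataFrame (hypothetical data, not the df from this article) to show element-wise masking, row filtering, a combined condition, and isin with several values.

```python
import pandas as pd

# Hypothetical data, chosen so the results are easy to verify by eye
df = pd.DataFrame({
    "A": [-1.0, 2.0, -3.0, 4.0],
    "B": [5.0, -6.0, 7.0, 8.0],
    "E": ["One", "Two", "One", "Three"],
})

num = df[["A", "B"]]
masked = num[num > 0]                    # non-positive values become NaN
rows = df[df.A > 0]                      # rows where A is positive
both = df[(df.A > 0) & (df.B > 0)]       # combine conditions with & and parentheses
ones = df[df.E.isin(["One", "Three"])]   # rows whose E is any of the listed values

print(masked)
print(both)
```

The parentheses around each comparison are required because & binds more tightly than > in Python.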
