<div style="font-size:16px;">
<p>1: Constructing a MultiIndex</p>
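<p>>>> # The following shows how to construct a pd.MultiIndex</p>
<p>>>> import numpy as np</p>
<p>>>> import pandas as pd</p>
<p>>>> from pandas import DataFrame</p>
<p>>>> df1=DataFrame(np.random.randint(0,150,size=(6,3)),columns=['java','html5','python'])</p>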
<p>>>> df1=DataFrame(np.random.randint(0,150,size=(6,3)),columns=['java','html5','python'],index=pd.MultiIndex.from_arrays([['张三','张三','侯少','侯少','a','a'],['M','E','M','E','M','E']]))</p>
<p>>>> df1  # the Chinese labels print as ? below, most likely because the console encoding cannot render them</p>
<p>java html5 python</p>
<p>???? M 2 13 76</p>
<p>E 141 67 84</p>
<p>M 116 83 8</p>
<p>E 70 118 125</p>
<p>a M 74 0 76</p>
<p>E 111 31 8</p>
<p>>>> # Build the index from tuples</p>
<p>>>> df2=DataFrame(np.random.randint(0,150,size=(6,3)),columns=['java','html','python'],index=pd.MultiIndex.from_tuples([('a','1'),('a','11'),('b','1'),('b','11'),('c','1'),('c','11')]))</p>
<p>>>> df2</p>
<p>java html python</p>
<p>a 1 32 144 99</p>
<p>11 104 101 16</p>
<p>b 1 93 98 41</p>
<p>11 59 30 45</p>
<p>c 1 91 17 149</p>
<p>11 9 28 59</p>
<p>>>> # Build the index with from_product</p>
<p>>>> df2=DataFrame(np.random.randint(0,150,size=(6,3)),columns=['java','html','python'],index=pd.MultiIndex.from_product([['zhangsan','lisi','wangwu'],['mid','end']]))</p>
<p>>>> df2</p>
<p>java html python</p>
<p>zhangsan mid 50 128 54</p>
<p>end 3 4 91</p>
<p>lisi mid 4 93 110</p>
<p>end 116 123 122</p>
<p>wangwu mid 88 25 54</p>
<p>end 48 146 57</p>
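<p>The three constructors above differ only in how the labels are supplied. The following is a minimal sketch that builds the same six-row index in all three ways; the level names 'name' and 'exam' are illustrative assumptions, not part of the original session.</p>

```python
import pandas as pd

names = ['zhangsan', 'lisi', 'wangwu']
exams = ['mid', 'end']

# from_product: Cartesian product of the values given for each level
by_product = pd.MultiIndex.from_product([names, exams], names=['name', 'exam'])

# from_tuples: one (name, exam) tuple per row
by_tuples = pd.MultiIndex.from_tuples(
    [(n, e) for n in names for e in exams], names=['name', 'exam'])

# from_arrays: one array per level, aligned position by position
by_arrays = pd.MultiIndex.from_arrays(
    [[n for n in names for _ in exams], exams * len(names)],
    names=['name', 'exam'])

# All three describe the same labels in the same order
same = by_product.equals(by_tuples) and by_product.equals(by_arrays)
print(same)
```

<p>from_product is usually the shortest form when every combination of labels is needed; from_tuples and from_arrays are more convenient when only some combinations occur.</p>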
<p>>>> # The columns of a DataFrame can carry a MultiIndex as well</p>
<p>>>> df2=DataFrame(np.random.randint(0,150,size=(3,6)),columns=pd.MultiIndex.from_product([['java','html','python'],['mid','end']]),index=['张三','李四','王五'])</p>
<p>>>> df2</p>
<p>java html python</p>
<p>mid end mid end mid end</p>
<p>???? 33 38 112 70 113 110</p>
<p>???? 29 46 132 91 117 128</p>
<p>???? 73 56 118 82 132 39</p>
<p>>>></p>
<p>>>> df2['java','mid']  # select a single column</p>
<p>???? 33</p>
<p>???? 29</p>
<p>???? 73</p>
<p>Name: (java, mid), dtype: int32</p>
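<p>Selecting from MultiIndex columns can go beyond a single (outer, inner) pair. The sketch below uses a small frame with made-up values and hypothetical row labels r0 to r2 to show three common lookups.</p>

```python
import numpy as np
import pandas as pd

cols = pd.MultiIndex.from_product([['java', 'html', 'python'], ['mid', 'end']])
scores = pd.DataFrame(np.arange(18).reshape(3, 6), columns=cols,
                      index=['r0', 'r1', 'r2'])

# A full (outer, inner) tuple returns one column as a Series
one_column = scores[('java', 'mid')]

# An outer label alone returns a DataFrame with the inner labels as columns
java_block = scores['java']

# xs takes a cross-section on the inner level: every 'mid' column
all_mid = scores.xs('mid', axis=1, level=1)

print(one_column.tolist())
print(list(java_block.columns))
print(list(all_mid.columns))
```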
<p>>>> s['zhangsan':'lisi']  # s is simply a Series</p>
<p>Series([], dtype: int64)</p>
<p>>>> s.iloc[0:3]</p>
<p>a 0 1</p>
<p>1 2</p>
<p>b 0 3</p>
<p>dtype: int64</p>
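<p>The difference between label slicing and positional slicing on a MultiIndex Series can be sketched as follows. The Series here is a hypothetical stand-in with the same shape of labels as the output above (outer level a/b, inner level 0/1).</p>

```python
import pandas as pd

idx = pd.MultiIndex.from_product([['a', 'b'], [0, 1]])
s = pd.Series([1, 2, 3, 4], index=idx)

# Label slice on the outer level: both endpoints are included
by_label = s.loc['a':'b']

# Positional slice: the end position is excluded
by_pos = s.iloc[0:3]

# A single outer label returns the sub-Series indexed by the inner level
first = s['a']

print(len(by_label), by_pos.tolist(), first.tolist())
```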
<p>>>> # Slicing</p>
<p>>>> df2['张三':'王五']</p>
<p>java html python</p>
<p>mid end mid end mid end</p>
<p>???? 33 38 112 70 113 110</p>
<p>???? 29 46 132 91 117 128</p>
<p>???? 73 56 118 82 132 39</p>
<p>>>> df2.iloc[0:4]  # recommended</p>
<p>Df2[‘张三’,‘期中’]和df2.loc[‘张三’].loc[‘期中’]</p>
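<p># When the first level has several labels, the second level cannot be indexed directly with a plain label lookup.</p>
<p># One way around this is to turn the second level into the first level and then index it.</p>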
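<p>As a rough illustration of selecting on the second level, the sketch below uses a made-up frame and compares two approaches: swapping the levels so the second one comes first, and DataFrame.xs, which takes a cross-section on a named level without reordering anything. The frame and its values are assumptions for demonstration only.</p>

```python
import numpy as np
import pandas as pd

idx = pd.MultiIndex.from_product([['zhangsan', 'lisi', 'wangwu'], ['mid', 'end']])
df = pd.DataFrame(np.arange(18).reshape(6, 3), index=idx,
                  columns=['java', 'html', 'python'])

# Approach 1: make the second level the first one, then index it normally
mid_swapped = df.swaplevel(0, 1).loc['mid']

# Approach 2: take a cross-section directly on level 1
mid_xs = df.xs('mid', level=1)

print(mid_xs)
```

<p>Both results keep only the rows whose second-level label is 'mid' and are indexed by the remaining first-level labels.</p>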
<p>>>> df2.stack()</p>
<p>html java python</p>
<p>???? end 70 38 110</p>
<p>mid 112 33 113</p>
<p>end 91 46 128</p>
<p>mid 132 29 117</p>
<p>end 82 56 39</p>
<p>mid 118 73 132</p>
<p>>>> # stack piles column labels up into the row index</p>
<p>>>> df2.stack(level=0)</p>
<p>end mid</p>
<p>???? html 70 112</p>
<p>java 38 33</p>
<p>python 110 113</p>
<p>html 91 132</p>
<p>java 46 29</p>
<p>python 128 117</p>
<p>html 82 118</p>
<p>java 56 73</p>
<p>python 39 132</p>
<p>>>> # level defaults to -1, the innermost column level</p>
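<p>The effect of the level argument can be checked on a tiny frame. This is a minimal sketch with made-up values and hypothetical row labels r0 and r1; the row and column order of the result may differ between pandas versions, so the example only looks values up by label.</p>

```python
import pandas as pd

cols = pd.MultiIndex.from_product([['java', 'html'], ['mid', 'end']])
df = pd.DataFrame([[1, 2, 3, 4], [5, 6, 7, 8]], index=['r0', 'r1'], columns=cols)

# Default level=-1: the inner column level (mid/end) moves into the row index
inner = df.stack()

# level=0: the outer column level (java/html) moves into the row index instead
outer = df.stack(level=0)

# unstack reverses the operation, moving the innermost row level back to columns
back = inner.unstack()

print(inner.loc[('r0', 'mid'), 'java'])   # value of java/mid for row r0
print(outer.loc[('r0', 'java'), 'end'])   # value of java/end for row r0
```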
<p>2: Computing with a MultiIndex</p>
<p>>>> df2</p>
<p>java html python</p>
<p>mid end mid end mid end</p>
<p>???? 33 38 112 70 113 110</p>
<p>???? 29 46 132 91 117 128</p>
<p>???? 73 56 118 82 132 39</p>
<p>>>> df1.sum()</p>
<p>java 514</p>
<p>html5 312</p>
<p>python 377</p>
<p>dtype: int64</p>
<p>>>> df1.sum(axis=0)</p>
<p>java 514</p>
<p>html5 312</p>
<p>python 377</p>
<p>dtype: int64</p>
<p>>>> df1.sum(axis=1)  # across the columns</p>
<p>???? M 91</p>
<p>E 292</p>
<p>M 207</p>
<p>E 313</p>
<p>a M 150</p>
<p>E 150</p>
<p>dtype: int64</p>
<p>>>> df1.sum(axis=1)  # sum across the columns, giving one total per row</p>
<p>???? M 91</p>
<p>E 292</p>
<p>M 207</p>
<p>E 313</p>
<p>a M 150</p>
<p>E 150</p>
<p>dtype: int64</p>
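<p>The axis argument and a per-level aggregation can be compared on a small frame. The sketch below uses made-up values, and the level names 'name' and 'exam' are assumptions; the groupby(level=...) call is one possible way to total the rows that share a first-level label, not something shown in the session above.</p>

```python
import pandas as pd

idx = pd.MultiIndex.from_product([['a', 'b'], ['M', 'E']], names=['name', 'exam'])
df = pd.DataFrame([[1, 2], [3, 4], [5, 6], [7, 8]], index=idx,
                  columns=['java', 'python'])

# axis=0 (the default): add down the rows, one total per column
col_totals = df.sum(axis=0)

# axis=1: add across the columns, one total per row
row_totals = df.sum(axis=1)

# Group on the outer index level: one row of totals per name
per_name = df.groupby(level='name').sum()

print(col_totals.tolist())
print(row_totals.tolist())
print(per_name)
```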
<p>>>> df1.std</p>
<p></p>
<p>???? M 2 13 76</p>
<p>E 141 67 84</p>
<p>M 116 83 8</p>
<p>E 70 118 |
|