Python数据处理之pd.Series()函数的基本使用

目录
  • 1.Series介绍
  • 2.Series创建
    • 1.pd.Series([list],index=[list])
    • 2.pd.Series(np.arange())
  • 3 Series基本属性
  • 4 索引
  • 5 计算、描述性统计
  • 6 排序
  • 总结

1.Series介绍

Pandas模块的数据结构主要有两种:1.Series 2.DataFrame

Series 是一维数组,基于Numpy的ndarray 结构

Series([data, index, dtype, name, copy, …])
# One-dimensional ndarray with axis labels (including time series).

2.Series创建

import Pandas as pd
import numpy as np

1.pd.Series([list],index=[list])

参数为list ,index为可选参数,若不填写则默认为index从0开始

obj = pd.Series([4, 7, -5, 3, 7, np.nan])
obj

输出结果为:

0    4.0
1    7.0
2   -5.0
3    3.0
4    7.0
5    NaN
dtype: float64

2.pd.Series(np.arange())

arr = np.arange(6)
s = pd.Series(arr)
s

输出结果为:

0    0
1    1
2    2
3    3
4    4
5    5
dtype: int32

pd.Series({dict})
d = {'a':10,'b':20,'c':30,'d':40,'e':50}
s = pd.Series(d)
s

输出结果为:

a    10
b    20
c    30
d    40
e    50
dtype: int64

可以通过DataFrame中某一行或者某一列创建序列

3 Series基本属性

  • Series.values:Return Series as ndarray or ndarray-like depending on the dtype
obj.values
# array([ 4.,  7., -5.,  3.,  7., nan])
  • Series.index:The index (axis labels) of the Series.
obj.index
# RangeIndex(start=0, stop=6, step=1)
  • Series.name:Return name of the Series.

4 索引

  • Series.loc:Access a group of rows and columns by label(s) or a boolean array.
  • Series.iloc:Purely integer-location based indexing for selection by position.

5 计算、描述性统计

Series.value_counts:Return a Series containing counts of unique values.

index = ['Bob', 'Steve', 'Jeff', 'Ryan', 'Jeff', 'Ryan']
obj = pd.Series([4, 7, -5, 3, 7, np.nan],index = index)
obj.value_counts()

输出结果为:

7.0    2
 3.0    1
-5.0    1
 4.0    1
dtype: int64

6 排序

Series.sort_values

Series.sort_values(self, axis=0, ascending=True, inplace=False, kind='quicksort', na_position='last')

Parameters:

Parameters Description
axis {0 or ‘index’}, default 0,Axis to direct sorting. The value ‘index’ is accepted for compatibility with DataFrame.sort_values.
ascendin bool, default True,If True, sort values in ascending order, otherwise descending.
inplace bool, default FalseIf True, perform operation in-place.
kind {‘quicksort’, ‘mergesort’ or ‘heapsort’}, default ‘quicksort’Choice of sorting algorithm. See also numpy.sort() for more information. ‘mergesort’ is the only stable algorithm.
na_position {‘first’ or ‘last’}, default ‘last’,Argument ‘first’ puts NaNs at the beginning, ‘last’ puts NaNs at the end.

Returns:

Series:Series ordered by values.

obj.sort_values()

输出结果为:

Jeff    -5.0
Ryan     3.0
Bob      4.0
Steve    7.0
Jeff     7.0
Ryan     NaN
dtype: float64

  • Series.rank
Series.rank(self, axis=0, method='average', numeric_only=None, na_option='keep', ascending=True, pct=False)[source]

Parameters:

Parameters Description
axis {0 or ‘index’, 1 or ‘columns’}, default 0Index to direct ranking.
method {‘average’, ‘min’, ‘max’, ‘first’, ‘dense’}, default ‘average’How to rank the group of records that have the same value (i.e. ties): average, average rank of the group; min: lowest rank in the group; max: highest rank in the group; first: ranks assigned in order they appear in the array; dense: like ‘min’, but rank always increases by 1,between groups
numeric_only bool, optional,For DataFrame objects, rank only numeric columns if set to True.
na_option {‘keep’, ‘top’, ‘bottom’}, default ‘keep’, How to rank NaN values:;keep: assign NaN rank to NaN values; top: assign smallest rank to NaN values if ascending; bottom: assign highest rank to NaN values if ascending
ascending bool, default True Whether or not the elements should be ranked in ascending order.
pct bool, default False Whether or not to display the returned rankings in percentile form.

Returns:

same type as caller :Return a Series or DataFrame with data ranks as values.

# obj.rank()            #从大到小排,NaN还是NaN
obj.rank(method='dense')
# obj.rank(method='min')
# obj.rank(method='max')
# obj.rank(method='first')
# obj.rank(method='dense')

输出结果为:

Bob      3.0
Steve    4.0
Jeff     1.0
Ryan     2.0
Jeff     4.0
Ryan     NaN
dtype: float64

总结

到此这篇关于Python数据处理之pd.Series()函数的基本使用的文章就介绍到这了,更多相关Python pd.Series()函数内容请搜索我们以前的文章或继续浏览下面的相关文章希望大家以后多多支持我们!

时间: 2022-06-22

python 遍历pd.Series的index和value

遍历pd.Series的index和value的方法如下,python built-in list的enumerate方法不管用 for i, v in s.items(): print('index: ', i, 'value: ', v) #index: a value: 1 #index: b value: 2 #index: c value: 3 #index: d value: 4 for i, v in s.iteritems(): print('index: ', i, 'valu

python pandas 对series和dataframe的重置索引reindex方法

reindex更多的不是修改pandas对象的索引,而只是修改索引的顺序,如果修改的索引不存在就会使用默认的None代替此行.且不会修改原数组,要修改需要使用赋值语句. series.reindex() import pandas as pd import numpy as np obj = pd.Series(range(4), index=['d', 'b', 'a', 'c']) print obj d 0 b 1 a 2 c 3 dtype: int64 print obj.reinde

python遍历序列enumerate函数浅析

enumerate函数用于遍历序列中的元素以及它们的下标. enumerate函数说明: 函数原型:enumerate(sequence, [start=0]) 功能:将可循环序列sequence以start开始分别列出序列数据和数据下标 即对一个可遍历的数据对象(如列表.元组或字符串),enumerate会将该数据对象组合为一个索引序列,同时列出数据和数据下标. 举例说明: 存在一个sequence,对其使用enumerate将会得到如下结果: start        sequence[0]

python 遍历列表提取下标和值的实例

如下所示: for index,value in enumerate(['apple', 'oppo', 'vivo']): print(index,value) 以上这篇python 遍历列表提取下标和值的实例就是小编分享给大家的全部内容了,希望能给大家一个参考,也希望大家多多支持我们.

python遍历 truple list dictionary的几种方法总结

实例如下: def TestDic1(): dict2 ={'aa':222,11:222} for val in dict2: print val def TestDic2(): dict2 ={'aa':222,11:222} for (key,val) in dict2.items(): print key,":",val def TestList1(): list=[1,2,3,4,5,3,2,'ada','fs3'] for i in range(len(list)): pr

解决Python 遍历字典时删除元素报异常的问题

错误的代码① d = {'a':1, 'b':0, 'c':1, 'd':0} for key, val in d.items(): del(d[k]) 错误的代码② -- 对于Python3 d = {'a':1, 'b':0, 'c':1, 'd':0} for key, val in d.keys(): del(d[k]) 正确的代码 d = {'a':1, 'b':0, 'c':1, 'd':0} keys = list(d.keys()) for key, val in keys: d

Python遍历目录并批量更换文件名和目录名的方法

本文实例讲述了Python遍历目录并批量更换文件名和目录名的方法.分享给大家供大家参考,具体如下: #encoding=utf-8 #author: walker #date: 2014-03-07 #summary: 深度遍历指定目录,并将子目录和文件名改为小写 #注意,此程序只针对windows,windows下文件(夹)名不区分大小写 import os import os.path import shutil #读入指定目录并转换为绝对路径 rootdir = raw_input('ro

python遍历目录的方法小结

本文实例总结了python遍历目录的方法.分享给大家供大家参考,具体如下: 方法一使用递归: """ def WalkDir( dir, dir_callback = None, file_callback = None ): for item in os.listdir( dir ): print item; fullpath = dir + os.sep + item if os.path.isdir( fullpath ): WalkDir( fullpath, dir

python遍历数组的方法小结

本文实例总结了python遍历数组的方法.分享给大家供大家参考.具体分析如下: 下面介绍两种遍历数组的方法,一种是直接通过for in 遍历数组,另外一种是通过rang函数先获得数组长度,在根据索引遍历数组 第一种,最常用的,通过for in遍历数组 colours = ["red","green","blue"] for colour in colours: print colour # red # green # blue 下面的方法可以先获

python遍历类中所有成员的方法

本文实例讲述了python遍历类中所有成员的方法.分享给大家供大家参考.具体分析如下: 这段代码自定义了一个类,类包含了两个成员title和url,在类的内部定义了一个函数list_all_member用于输出类的所有成员变量及值 # -*- coding: utf-8 -*- class Site(object): def __init__(self): self.title = 'jb51 js code' self.url = 'http://www.jb51.net' def list_