Indexing and selecting data

In this section, we will focus on how to get, set, or slice subsets of pandas data structures. As we learned in previous sections, Series and DataFrame objects carry axis labels. We can use these labels to identify the items that we want to select or to which we want to assign a new value:

>>> s4[['024', '002']]    # selecting data of Series object
024     NaN
002    Mary
dtype: object
>>> s4[['024', '002']] = 'unknown' # assigning data
>>> s4
024    unknown
065        NaN
002    unknown
001        Nam
dtype: object
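
The session above relies on s4 from an earlier section. As a self-contained sketch, the following rebuilds a Series with the same labels and values; the constructor call is an assumption inferred from the printed output, not the original definition:

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of s4, inferred from the output shown above.
s4 = pd.Series([np.nan, np.nan, "Mary", "Nam"],
               index=["024", "065", "002", "001"])

# Selecting with a list of labels returns a new Series whose rows
# follow the order of the list, not the order in s4.
selected = s4[["024", "002"]]
print(selected)

# Assigning a scalar to a list of labels sets every selected item in s4.
s4[["024", "002"]] = "unknown"
print(s4)
```

The selection returns a copy, so the later assignment changes s4 but not selected.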

If the data object is a DataFrame, we can proceed in a similar way; here, a list of labels selects columns:

>>> df5[['b', 'c']]
   b  c
0  1  2
1  4  5
2  7  8
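
As a minimal sketch, df5 can be rebuilt as follows; the constructor call is an assumption that matches the values printed above. It also shows the difference between passing a list of labels and passing a single label:

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of df5: a 3x3 frame holding 0..8 with columns a, b, c.
df5 = pd.DataFrame(np.arange(9).reshape(3, 3), columns=["a", "b", "c"])

# A list of column labels returns a DataFrame with just those columns.
sub = df5[["b", "c"]]
print(sub)

# A single label (no list) returns that column as a Series.
col = df5["b"]
print(col)
```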

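For label indexing on the rows of a DataFrame, we use the ix indexer, which selects a set of rows and columns in the object. Note that ix is an indexing attribute used with square brackets rather than a function, and that it was deprecated in pandas 0.20 and removed in pandas 1.0; current versions use loc for label-based selection and iloc for position-based selection. The indexer takes up to two arguments: the row labels and the column labels that we want to get. If we do not specify the columns, it returns the selected rows with all columns in the object: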

>>> df5.ix[0]
a    0
b    1
c    2
Name: 0, dtype: int64
>>> df5.ix[0, 1:3]
b    1
c    2
Name: 0, dtype: int64
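
On pandas 1.0 and later, where ix is no longer available, the same selections can be sketched with loc and iloc. The df5 constructor below is an assumption that matches the values printed above:

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of df5: a 3x3 frame holding 0..8 with columns a, b, c.
df5 = pd.DataFrame(np.arange(9).reshape(3, 3), columns=["a", "b", "c"])

# Row with index label 0, all columns (counterpart of df5.ix[0]).
row = df5.loc[0]

# Row at position 0, columns at positions 1 and 2
# (counterpart of df5.ix[0, 1:3]; the end of a positional slice is excluded).
by_pos = df5.iloc[0, 1:3]

# The same cells selected by label; label slices include both endpoints.
by_label = df5.loc[0, "b":"c"]

print(row)
print(by_pos)
print(by_label)
```

Keeping label-based and position-based selection in separate indexers avoids the ambiguity that ix had when an index contained integers.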

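pandas also provides several other methods for selecting and editing data contained in a pandas object, summarized in the following table. Note that icol, irow, get_value, and set_value have been removed from recent pandas releases in favor of the iloc, at, and iat indexers.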

Method                  Description
icol, irow              Selects a single column (icol) or row (irow) by integer location.
get_value, set_value    Gets or sets a single value of a data object by row and column label.
xs                      Selects a single row or column as a Series by label.
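
The following sketch shows how the operations in the table can be written on a current pandas version, using iloc and at as stand-ins for the removed methods. The df5 constructor and the value 40 are assumptions chosen for illustration:

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of df5: a 3x3 frame holding 0..8 with columns a, b, c.
df5 = pd.DataFrame(np.arange(9).reshape(3, 3), columns=["a", "b", "c"])

# Integer-location selection (the role of irow and icol).
first_row = df5.iloc[0]        # row at position 0
second_col = df5.iloc[:, 1]    # column at position 1

# xs selects a row (default) or a column (axis=1) by label.
row_xs = df5.xs(2)
col_xs = df5.xs("c", axis=1)

# Scalar access by row and column label (the role of get_value and set_value).
value = df5.at[1, "b"]
df5.at[1, "b"] = 40            # 40 is an arbitrary illustrative value
print(df5)
```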

Note

pandas data objects may contain duplicate index labels. In this case, getting or setting a value by label affects every row or column that shares the selected label.
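
A small sketch of this behavior, using a hypothetical Series whose label "x" appears twice:

```python
import pandas as pd

# Hypothetical Series with a duplicated index label ("x").
s = pd.Series([1, 2, 3], index=["x", "y", "x"])

# Getting by a duplicated label returns every matching item as a Series.
dup_values = s.loc["x"].tolist()
print(dup_values)

# Setting by a duplicated label overwrites every matching item.
s.loc["x"] = 0
print(s)

# Index.is_unique reports whether any label is repeated.
print(s.index.is_unique)
```

Checking is_unique before a label-based assignment is one way to avoid overwriting more items than intended.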
